DUKE MATHEMATICAL JOURNAL





LEONARD CARLITZ 


JOSEPH MILLER THOMAS 
Managing Editor 


WITH THE COOPERATION OF 
R. P. BOAS, JR.     J. W. GREEN      W. T. MARTIN
H. S. M. COXETER    G. A. HEDLUND    F. J. MURRAY
J. L. DOOB          N. LEVINSON      GORDON PALL
J. J. GERGEN        E. J. McSHANE    J. H. ROBERTS
C. C. MacDUFFEE     J. W. TUKEY


Volume 9, Number 1 
MARCH, 1942 







DUKE MATHEMATICAL JOURNAL 


This periodical is published quarterly under the auspices of Duke University 
by Duke University Press at Durham, North Carolina. It is printed at Mt.
Royal and Guilford Avenues, Baltimore, Maryland, by the Waverly Press.

Entered as second class matter at the Post Office, Durham, North Carolina. 
Additional entry at the Post Office, Baltimore, Maryland. 

The subscription price for the current year is four dollars, postpaid; back 
volumes, five dollars each, carriage extra. Subscriptions, orders for back num- 
bers, and notice of change of address should be sent to Duke University Press, 
Durham, North Carolina. 

Individual and institutional members of the Mathematical Association of 
America may subscribe to the current volume at half price. To get the reduced 
price, orders for subscriptions must bear the mention “Member MAA.” If an
order at the reduced price is placed through an agent, the purchaser must pay 
any commission charge incurred. 

Since 1935 the Mathematical Association of America has given the 
Duke Mathematical Journal an annual subsidy, in return for which 
the half-rate has been allowed. Having served its purpose of aiding 
the establishment of the Journal, the subsidy is to be discontinued at 
the end of 1942. In view of the help already received from the Asso- 
ciation, however, Duke University Press will for at least five years con- 
tinue to allow the half-rate to any one who in 1942 is a subscriber at 
the reduced rate provided his subscription to the Journal and mem- 
bership in the Association remain unbroken. This arrangement is ex- 
pected to be permanent, but the Press reserves the right to modify or 
withdraw it after the five years and to change the basic rate at the 
beginning of any calendar year. 

Manuscripts and editorial correspondence should be addressed to Duke 
Mathematical Journal, 4785 Duke Station, Durham, North Carolina. An 
Author’s Manual containing detailed information about the preparation of 
papers for publication will be sent on request. 

Authors are entitled to one hundred free reprints. Additional copies will be
supplied at cost. All reprints will be furnished with covers unless the contrary
is specifically requested.


The American Mathematical Society is officially represented on the Editorial 
Board by Professors Murray and Ward. 


Made in United States of America 


WAVERLY PRESS, INC. 
BALTIMORE, U. S. A.













A COMPARISON OF LINEAR MEASURES IN THE PLANE 


By Seymour SHERMAN 


In generalizing from the notion of the length of curve and linear measure of 
a linear set to the linear measure of a plane set the following considerations arise: 

1. Does the new measure give the expected results for point sets which can 
be treated by the old methods? 

2. Is the new measure invariant under Euclidean transformation of the set? 

3. Does the new measure have the usual measure properties, i.e., is it com- 
pletely additive; does it satisfy the general Carathéodory measure postulates? 

Some such generalizations satisfying these and more subtle¹ requirements
have been proposed by Carathéodory, Gross, Steinhaus,² Favard, Kolmogoroff,³
Appert, Randolph, and Morse. Of these measures the ones associated with
Carathéodory, Gross, Appert, Randolph, and Morse are closely related and
involve countable decompositions of the given set while the ones suggested by
Kolmogoroff and Steinhaus diverged into different paths. Kolmogoroff measure
originated, in part, with the notion of Schmidt [10] that the measure of a
contracted set is not greater than the measure of the set. Steinhaus measure
originated (1) with the measure (see footnote 6) of sets of lines used for the Buffon
needle problem and (2), surprisingly enough, with a mechanical device⁴ for
measuring lengths of curves as seen under a microscope.

We are mainly concerned with the relationship between Carathéodory linear 
measure and Steinhaus linear measure. We complete Steinhaus' proof that
the Steinhaus measure of a rectifiable Jordan curve is equal to twice its length
as defined by the inscribed polygon approach. In Theorem 4, we show that,
contrary to Steinhaus' expressed belief,⁵ there are sets (irregular in the sense of
Besicovitch) whose Steinhaus measure is different from twice their Carathéodory 
linear measure, and, in general, if a set is measurable Carathéodory, then its 
Steinhaus linear measure is equal to twice the Carathéodory linear measure of 
its regular part. Unlike other linear measures, Steinhaus linear measure may 


Received January 2, 1941. The author is indebted to Prof. J. F. Randolph for many 
helpful suggestions. 

1 One such requirement is that the outer linear Lebesgue measure of the projection of the
set on any line be not greater than the outer measure of the set. For a discussion of this 
requirement see [5]. Numbers in square brackets refer to the bibliography. For con- 
siderations involving a natural generalization to higher dimensions see [7] and [8]. 

2 See [12] and [13]. This measure is completely additive over the family of sets measur- 
able Steinhaus but, since it is not introduced by means of an exterior measure function, the 
Carathéodory measure postulates do not apply. This measure was later independently 
suggested by Favard [4]. 

3 Kolmogoroff measure [6] is defined merely for analytic sets and so the general Cara- 
théodory postulates do not apply. It is completely additive over analytic sets. 

4 See [12]. 

5 See [13], p. 354. 










be applied not only to find a linear measure for a point set but also to find a 
length for a parametrized curve. In Theorem 5 we establish by means of 
Steinhaus measure a very natural relation between the length of a rectifiable 
curve and the Carathéodory measures of the sets of its multiple points. 


Deltheil measure.⁶ Let line $l$ in the plane have coordinates $(p, \theta)$ assigned to
it in the following manner. If $l$ is not vertical and does not pass through the
origin or if $l$ is vertical and lies to the left of the origin, if $p$ ($p > 0$) is the
distance from the origin to the line and $\theta$ ($2\pi > \theta > 0$) is the angle made between
the horizontal ray through the origin and the normal from the origin to $l$, then
$(p, \theta)$ are the coordinates of $l$; if $l$ is vertical and passes through the origin,
then $l$ has the coordinates $(0, 0)$, $(0, \pi)$, and $(0, 2\pi)$; if $l$ is vertical and lies to
the right of the origin, and $p$ ($p > 0$) is the distance from the origin to $l$, then $l$
has the coordinates $(p, 0)$ and $(p, 2\pi)$; and if $l$ is not vertical and does pass through
the origin and $\theta$ ($\pi > \theta > 0$) is the angle made between the initial line and
the undirected normal, then $l$ has the coordinates $(0, \theta)$ and $(0, \theta + \pi)$. If $S$
is a set of lines such that

(1)   $D(S) = \iint_{(p,\theta)\in S} dp\,d\theta,$

where the integration is in the sense of Lebesgue, exists, then we say that $S$
is measurable (D) and $D(S)$ is its Deltheil measure. This measure is invariant
with respect to Euclidean transformations of $S$.
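The coordinate assignment above can be sketched in code. The following is a minimal illustration, not part of the original paper: the helper name `line_coordinates` is hypothetical, and for a line through the origin it returns only one of the several coordinate pairs that the text assigns.

```python
import math

def line_coordinates(P, Q):
    """Return (p, theta), with p >= 0 and 0 <= theta < 2*pi, for the line through P and Q.

    p is the distance from the origin to the line and theta is the angle between
    the horizontal ray through the origin and the normal from the origin to the line.
    """
    (x1, y1), (x2, y2) = P, Q
    # A vector normal to the direction Q - P.
    nx, ny = y2 - y1, x1 - x2
    norm = math.hypot(nx, ny)
    if norm == 0.0:
        raise ValueError("P and Q must be distinct points")
    nx, ny = nx / norm, ny / norm
    # Signed distance from the origin along the chosen unit normal.
    p = nx * x1 + ny * y1
    if p < 0:
        # Flip the normal so that it points from the origin toward the line.
        nx, ny, p = -nx, -ny, -p
    theta = math.atan2(ny, nx) % (2 * math.pi)
    return p, theta

# Every point (x, y) of the line satisfies x*cos(theta) + y*sin(theta) = p.
p, theta = line_coordinates((-1.0, -3.0), (2.0, 1.0))
residual = 2.0 * math.cos(theta) + 1.0 * math.sin(theta) - p
```

With these coordinates, a set of lines becomes a subset of the $(p, \theta)$-plane, and its Deltheil measure is the ordinary planar Lebesgue measure of that subset.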


Steinhaus measure. If $A$ is a set of points in the plane, we define $f_A(p, \theta)$ to be
a function of lines whose value is the number of points of $A$ on the line $(p, \theta)$.
Thus $f_A(p, \theta)$ has the value 0, $n$ (positive integer), or $+\infty$. If $f_A(p, \theta)$ is
integrable Lebesgue, let

(2)   $St(A) = \iint f_A(p, \theta)\,dp\,d\theta,$

where the integration is taken over the whole $(p, \theta)$-plane. This Steinhaus
measure is invariant under Euclidean transformations of $A$. It is easy to see
that the Steinhaus measure of a line segment is equal to twice its length and
that the Steinhaus measure of a polygon is equal to twice its perimeter. Steinhaus
stated that if $A$ is a continuous rectifiable curve, then

(3)   $St(A) = 2l(A),$

where $l(A)$ is the length of $A$ calculated by polygonal approximation and $St(A)$
is calculated by giving to each point of $A$ the multiplicity it gains by the
parametrization. After a few definitions and lemmas we shall give the complete
proof.
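The claim that the Steinhaus measure of a segment is twice its length can be checked numerically. The sketch below is an illustration only, under the assumption that $p$ ranges over $p \ge 0$ and $\theta$ over $[0, 2\pi)$; the function name `steinhaus_measure_of_segment` is hypothetical. For fixed $\theta$, the line $(p, \theta)$ meets the segment exactly when $p$ lies between the projections of the two endpoints on the normal direction, so the inner integral in (2) is the length of that interval intersected with $p \ge 0$.

```python
import math

def steinhaus_measure_of_segment(P, Q, n_theta=20000):
    """Approximate the double integral (2) for the segment PQ by a midpoint rule in theta."""
    d_theta = 2.0 * math.pi / n_theta
    total = 0.0
    for k in range(n_theta):
        theta = (k + 0.5) * d_theta
        c, s = math.cos(theta), math.sin(theta)
        # Projections of the endpoints on the unit normal (cos theta, sin theta).
        a = P[0] * c + P[1] * s
        b = Q[0] * c + Q[1] * s
        lo, hi = min(a, b), max(a, b)
        # Length of the set of p >= 0 for which the line (p, theta) meets the segment.
        total += max(0.0, hi - max(lo, 0.0)) * d_theta
    return total

# A segment of length 5; the value should be close to twice that length.
approx = steinhaus_measure_of_segment((1.0, 2.0), (4.0, 6.0))
```

The lines passing through an endpoint or containing the segment form a set of measure zero, so they do not affect the integral.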


6 See R. Deltheil, Probabilités Géométriques, Paris, 1926. 













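For a plane set $A$,

(4)   $T(A) \overset{\mathrm{df}}{=} \mathop{E}_{(p,\theta)} [f_A(p, \theta) > 0],$

(5)   $C_{T(A)}(p, \theta) \overset{\mathrm{df}}{=}$ the characteristic function of $T(A)$,

and if $(R, \alpha)$ are the polar coordinates of a point in $A$, then

(6)   $T(R, \alpha) = \mathop{E}_{(p,\theta)} [p = R\cos(\alpha - \theta),\ p \ge 0,\ 2\pi \ge \theta \ge 0].$

It is easy to see that

(7)   $T(A) = \sum_{(R,\alpha)\in A} T(R, \alpha)$

and

(8)   $T\Big(\sum_{j\in E} A^j\Big) = \sum_{j\in E} T(A^j),$

where the $A^j$ are plane sets and the index $j$ can have any range $E$ whatsoever.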
Lemma 1.⁷ If $m_c(A) = 0$, then $St(A) = 0$.

Proof. If $m_c(A) = 0$, then there exists a sequence $\{\mathfrak{C}_i\}$ of countable
coverings of $A$ by squares such that $S_i \to 0$ as $i \to \infty$, where $S_i$ is the sum of the
sides of the squares in the covering $\mathfrak{C}_i$. Obviously

$T(A) \subset T(\mathfrak{C}_i),$

where $T(\mathfrak{C}_i)$ is the union of the transforms of the squares in the covering $\mathfrak{C}_i$, and⁸

$|T(A)|_2 \le |T(\mathfrak{C}_i)|_2 \to 0.$

Thus $T(A)$ is measurable (Lebesgue), $|T(A)|_2 = 0$, and $\iint f_A(p, \theta)\,dp\,d\theta = 0$.


Lemma 2. If $\mathcal{C}$ is a continuous rectifiable curve parametrized according to arc
length, then

$\Big|\mathop{E}_{s}\Big[\frac{dx}{ds}$ fails to exist or $\frac{dy}{ds}$ fails to exist or both $\frac{dx}{ds}$ and $\frac{dy}{ds}$ equal zero$\Big]\Big|_1 = 0,$

and the Deltheil measure of those lines which are tangent to the curve at points at which
$\frac{dx}{ds}$ and $\frac{dy}{ds}$ both exist and are not simultaneously null is zero.

Proof. After placing the curve in the first quadrant at a positive distance
from the origin, parametrize it according to arc length, in Cartesian coordinates

$x = x(s), \quad y = y(s), \quad 0 \le s \le l(\mathcal{C}),$


7 $m_c(A) \overset{\mathrm{df}}{=}$ Carathéodory linear measure of $A$. For definition and properties, see
[3] and [9], especially pp. 53-54.
8 $|B|_i \overset{\mathrm{df}}{=}$ Lebesgue $i$-dimensional measure of $B$.










and in polar coordinates

$R = R(s), \quad \alpha = \alpha(s), \quad 0 \le s \le l(\mathcal{C}).$

We note that $x(s)$, $y(s)$, $R(s)$, and $\alpha(s)$ all satisfy Lipschitz conditions of order 1
and so are absolutely continuous. Also $x(s) > 0$, $y(s) > 0$, $M > R(s) > 0$,
$\tfrac{1}{2}\pi > \alpha(s) > 0$. Therefore, if $\theta$ is fixed,

(9)   $R(s)\cos[\alpha(s) - \theta]$

satisfies a Lipschitz condition of order 1. By Saks, Theory of the Integral,
Chapter IV, (8.4), (iii), $\frac{dx}{ds}$ and $\frac{dy}{ds}$ exist for almost all $s$. From the measurability
of $\frac{dx}{ds}$ and $\frac{dy}{ds}$, the absolute continuity of $x(s)$ and $y(s)$, and the second part of
Saks (8.4), (iv), $\frac{dx}{ds}$ and $\frac{dy}{ds}$ are simultaneously zero on a set of $s$-values of measure
zero. Thus the first part of our lemma is proved.

Since $\arctan\Big(\frac{dy}{ds}\Big/\frac{dx}{ds}\Big)$, $\frac{dx}{ds} \ne 0$, is a measurable function of $s$ it is easy to see
that $\big|\mathop{E}_{(\theta,s)}[\theta = \theta_s]\big|_2 = 0$, where $\theta_s$ is the $\theta$ corresponding to the line tangent
to the curve at $(x(s), y(s))$. Consider now the transformation

(10)   $\theta = \theta, \qquad p = R(s)\cos[\alpha(s) - \theta].$

This transformation is continuous. Cover $E = \mathop{E}_{(\theta,s)}[\theta = \theta_s]$ by an open set $E(\epsilon)$
such that $|E(\epsilon)|_2 < \epsilon$. $E(\epsilon)$ is open and so is an⁹ $F_\sigma$. Its transform $\tau(E(\epsilon))$
in the $(p, \theta)$-plane is an¹⁰ $F_\sigma$, and so is measurable Lebesgue. By the Fubini
theorem

$|\tau(E(\epsilon))|_2 = \iint C_{\tau(E(\epsilon))}(p, \theta)\,dp\,d\theta.$

Since $R(s)\cos[\alpha(s) - \theta]$ satisfies a Lipschitz condition of order 1, there exists
a constant $M$ depending only on the curve such that

$\int C_{\tau(E(\epsilon))}(p, \theta)\,dp \le M \int C_{E(\epsilon)}(\theta, s)\,ds,$

and so

$|\tau(E(\epsilon))|_2 \le M\,|E(\epsilon)|_2 \le M\epsilon.$

Thus $|\tau(E)|_2 = 0$ and the second part of our lemma is proved.


9 See [11], Theorem 52a.
10 See [11], Theorem 41.













THEOREM 1. If $\mathcal{C}$ is a continuous rectifiable curve, then

$St(\mathcal{C}) = 2l(\mathcal{C}).$

Proof. Parametrize $\mathcal{C}$ according to arc length. Consider a sequence of
approximating polygons (with vertices on the curve) formed by interpolating more
and more vertices, where the maximum side approaches zero in length, and
where $l_n$ (the length of the $n$-th approximating polygon) approaches $l$ (the length
of the curve) as $n \to \infty$. If $\varphi_n(p, \theta)$ is the number of times (calculated with
the proper multiplicity) that $(p, \theta)$ intersects the $n$-th polygon, then

$\varphi_{n+1}(p, \theta) \ge \varphi_n(p, \theta),$

except for at most a finite set of lines (the sides of the $n$-th polygon). We now
show that for a sequence satisfying the above conditions

(11)   $\lim_{n\to\infty} \varphi_n(p, \theta) = f_{\mathcal{C}}(p, \theta),$

except for at most a set of lines tangent to the curve or through points of the curve
where either $\frac{dx}{ds}$ fails to exist, $\frac{dy}{ds}$ fails to exist, or $\frac{dx}{ds} = \frac{dy}{ds} = 0$. As can be seen
from Lemmas 1, 2, and Saks (8.4), (ii), the Deltheil measure of this set of lines
is zero.

We wish to show that, except for the set of lines of Deltheil measure zero,
equation (11) holds. Suppose $(p_0, \theta_0)$ is a line not in the set of measure zero.
If $f_{\mathcal{C}}(p_0, \theta_0) = 0$, it is not difficult to see that $\varphi_n(p_0, \theta_0) = 0$. If $f_{\mathcal{C}}(p_0, \theta_0) = K$
($0 < K < \infty$), then we can find an increasing (or decreasing) sequence of
different $s_i$'s, such that $(x(s_i), y(s_i))$, $1 \le i \le K$, is on line $(p_0, \theta_0)$. If
$f_{\mathcal{C}}(p_0, \theta_0) = \infty$, we can find by the procedure generally used to prove the
Bolzano-Weierstrass theorem an increasing (or decreasing) sequence of different $s_i$'s,
such that again $(x(s_i), y(s_i))$, $1 \le i < \infty$, is on the line $(p_0, \theta_0)$. If there exist
$s_a$ and $s_b$ such that $0 \le s_{j-1} < s_a \le s_j \le s_b < s_{j+1} \le l(\mathcal{C})$ and $s_a$ and $s_b$
determine succeeding vertices in the $n$-th polygon, then we say that $s_j$ is effective in
$\varphi_n(p_0, \theta_0)$. It is obvious that if $s_j$ is effective in $\varphi_N(p_0, \theta_0)$, then $s_j$ is effective
in each $\varphi_n(p_0, \theta_0)$, $N < n$. For $n$ large enough $s_j$ is effective in $\varphi_n(p_0, \theta_0)$. Let us
suppose that $\big(\frac{dx}{ds}\big)_{s=s_j} > 0$ and $\big(\frac{dy}{ds}\big)_{s=s_j} > 0$. Then there exists an $\eta$ such that

(12)   $y(s_{j-1}) < y(s) < y(s_j), \quad x(s_{j-1}) < x(s) < x(s_j), \quad s_j - \eta \le s < s_j;$

(13)   $y(s_j) < y(s) < y(s_{j+1}), \quad x(s_j) < x(s) < x(s_{j+1}), \quad s_j < s \le s_j + \eta;$

and

(14)   $\left|\frac{y(s) - y(s_j)}{x(s) - x(s_j)} - \Big(\frac{dy}{dx}\Big)_{s=s_j}\right| < \min\left[\left|\tan\Big(\theta_0 + \frac{\pi}{2}\Big) - \Big(\frac{dy}{dx}\Big)_{s=s_j}\right|,\ \left|\Big(\frac{dy}{dx}\Big)_{s=s_j}\right|\right], \quad 0 < |s_j - s| < \eta.$













This means that, if $s$ lies in the range described by (14), the points will be in a
sector with $(x(s_j), y(s_j))$ as center such that $(p_0, \theta_0)$ is not in the sector and
such that the upper half of the sector will be above the horizontal line through
the center and to the right of the vertical line through the center. The range
described in (12) has points only in the upper half-sector and so on one side of
the line $(p_0, \theta_0)$; the range described in (13) has points only in the lower
half-sector and so on the other side of the line $(p_0, \theta_0)$. For $n$ large enough, say $N'$,
the maximum arc length between successive vertices is less than $\eta$, and $s_j$ will
be effective in $\varphi_{N'}(p_0, \theta_0)$. If $s_j$ does not determine a vertex of the $n$-th polygon,
then a pair of succeeding vertices will be in opposite half-sectors, and by a
topological argument the side joining them will be intersected by $(p_0, \theta_0)$. Thus
after disposing of the other cases, e.g., $\frac{dy}{ds} = 0$, $\frac{dx}{ds} < 0$, we see that $s_j$
"contributes to" $\varphi_n(p_0, \theta_0)$ for $n$ large enough. And by induction

$\lim_{n\to\infty} \varphi_n(p_0, \theta_0) = f_{\mathcal{C}}(p_0, \theta_0).$

But

$2l(\mathcal{C}) = \lim_{n\to\infty} 2l_n = \lim_{n\to\infty} \iint \varphi_n(p, \theta)\,dp\,d\theta = \iint \lim_{n\to\infty} \varphi_n(p, \theta)\,dp\,d\theta = \iint f_{\mathcal{C}}(p, \theta)\,dp\,d\theta = St(\mathcal{C}),$

and the theorem is proved.
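The limiting step of the proof can be illustrated on the simplest closed curve. The sketch below assumes a circle of radius 1 and regular inscribed polygons whose vertex count doubles at each stage (so that vertices are only ever interpolated); it relies on the fact stated earlier that the Steinhaus measure of a polygon is twice its perimeter. The names `inscribed_polygon_perimeter`, `twice_lengths`, and `limit` are illustrative choices, not notation from the paper.

```python
import math

def inscribed_polygon_perimeter(radius, n_sides):
    """Perimeter l_n of the regular n-gon inscribed in a circle of the given radius."""
    return 2.0 * n_sides * radius * math.sin(math.pi / n_sides)

radius = 1.0
# Doubling the number of sides keeps every old vertex and interpolates new ones.
twice_lengths = [
    2.0 * inscribed_polygon_perimeter(radius, 3 * 2 ** k) for k in range(12)
]
# The theorem predicts that 2*l_n increases to 2*l = St(circle) = 4*pi*radius.
limit = 2.0 * (2.0 * math.pi * radius)
gaps = [limit - value for value in twice_lengths]
```

The list `gaps` is positive and decreasing, which mirrors the monotone convergence of $\iint \varphi_n\,dp\,d\theta$ to $\iint f_{\mathcal{C}}\,dp\,d\theta$ used in the last display.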

COROLLARY. If $\mathcal{C}$ is a rectifiable Jordan arc, then $St(\mathcal{C}) = St(C) = 2l(\mathcal{C}) =
2m_c(C)$, where $C$ is the corresponding point-set.

Lemma 3. If $A$ is closed and bounded, then $T(A)$ is closed and bounded.

Proof. Since $A$ is bounded, for $(p, \theta) \in T(A)$, $p$ is bounded and, of course,
$\theta$ is bounded ($2\pi \ge \theta \ge 0$). Now that we have $T(A)$ bounded, let us prove
that $T(A)$ is closed. Let $\{(p_i, \theta_i)\}$ be a convergent sequence of different lines
of $T(A)$. We must prove that $(p_i, \theta_i) \to (p_0, \theta_0) \in T(A)$ as $i \to \infty$. For
each $(p_i, \theta_i)$ consider one of its inverses $(R_i, \alpha_i) \in A$. Either the set $\{(R_i, \alpha_i)\}$
has only a finite set of distinct elements or it has an infinite set of distinct
elements. In the first case there would be an infinite number of distinct elements
of $\{(p_i, \theta_i)\}$ on a closed bounded curve in $T(A)$, and $(p_i, \theta_i) \to (p_0, \theta_0) \in T(A)$.
In the second case since $A$ is self-compact, there is a subsequence $(R_i, \alpha_i) \to
(R_0, \alpha_0) \in A$. This convergent subsequence determines a subsequence $\{(p_i, \theta_i)\}$.
But

$\lim (p_i, \theta_i) = \lim (R_i\cos(\alpha_i - \theta_i), \theta_i) = (R_0\cos(\alpha_0 - \theta_0), \theta_0) \in \mathop{E}_{(p,\theta)} [p = R_0\cos(\alpha_0 - \theta),\ p \ge 0] = T[(R_0, \alpha_0)] \subset T(A).$

Hence $\lim (p_i, \theta_i) \in T(A)$ and $T(A)$ is closed.

Lemma 4. If $A$ is an $F_\sigma$ in the $(R, \alpha)$-plane, then $T(A)$ is an $F_\sigma$ in the
$(p, \theta)$-plane.










Lemma 5. If $m_c A = a < \infty$, then $T(A)$ is measurable Lebesgue.

Proof. Since $m_c A < \infty$, we have $A = F + O$, where $F$ is an $F_\sigma$ in the
$(R, \alpha)$-plane and $m_c(O) = 0$. Hence $T(F)$ is an $F_\sigma$ in the $(p, \theta)$-plane. Since $m_c(O) = 0$,
$|T(O)|_2 = 0$. But $T(A) = T(F + O) = T(F) + T(O)$ and so $T(A)$ is measurable
Lebesgue.

Lemma 6 [2]. If $m_c(A) < \infty$, $A$ is irregular, and $A_\theta$ is the projection of $A$
on line $(p, \theta)$, then $m_c(A_\theta) = 0$ for almost all $\theta$.

Lemma 6.1. If $m_c A < \infty$ and $A$ is irregular, then $St(A) = 0$.

Proof. By Lemma 5, $C_{T(A)}(p, \theta)$ is measurable and by the Fubini Theorem

$|T(A)|_2 = \iint C_{T(A)}(p, \theta)\,dp\,d\theta = \int \Big[\int C_{T(A)}(p, \theta)\,dp\Big]\,d\theta.$

By Lemma 6

$\int C_{T(A)}(p, \theta)\,dp = 0$

for almost all $\theta$. Hence

$|T(A)|_2 = 0$

and

$St(A) = 0.$


Lemma 7. If a plane curve $\mathcal{C}$ is rectifiable, then there exists a continuous
rectifiable curve $\mathcal{C}'$ which is of equal length and contains $\mathcal{C}$ as a subset.

Proof. Let $\mathcal{C}$ be given by the equations

$x = X(t), \quad y = Y(t), \quad 0 \le t \le 1.$

$S(\mathcal{C}; 0, t)$ is a monotone, single-valued function of $t$, $0 \le t \le 1$. If this function
is continuous, then $\mathcal{C}$ is continuous. If the function is not continuous and $t_1$
is a point of discontinuity, let $S(t_1^-) = \sup_{t<t_1} S(\mathcal{C}; 0, t)$ and $S(t_1^+) = \inf_{t>t_1} S(\mathcal{C}; 0, t)$.
We have $S(t_1^-) < S(t_1^+)$. Now parametrize $\mathcal{C}$ in terms of arc length so that each
point is given by $(x(s), y(s))$ for some $s$, $0 \le s \le l(\mathcal{C})$. Since $S(\mathcal{C}; 0, t)$ is not
continuous, not every such $s$ has a corresponding point. Consider $\mathcal{C}'$ as
the curve generated by adding to $\mathcal{C}$ those parts of the line segments
$(x(s^-), y(s^-))$, $(x(s^+), y(s^+))$ not already contained in the curve, where $s^- = S(t_1^-)$
and $s^+ = S(t_1^+)$. This new curve
has the same length as $\mathcal{C}$. When it is parametrized according to arc length,
values of $s$ which yielded points on $\mathcal{C}$ yield the same points on $\mathcal{C}'$, and values
of $s$ which failed to yield points on $\mathcal{C}$ yield points on the new line segment.


Lemma 8. Any regular set is almost entirely contained in a countable class of 
rectifiable Jordan arcs. 


Proof. By [1], Theorem 16 and the preceding lemma, we show that any
regular set is contained in a set of measure zero and a countable class of
rectifiable arcs. By a modification of [1], Lemma 4 (change (i) to read: G is a set
of rectifiable Jordan arcs), the rest of the theorem follows.

Lemma 9. If $m_c(A) < \infty$ and $J$ is a rectifiable Jordan arc, then $2m_c(AJ) =
St(AJ)$.

Proof follows immediately from Corollary to Theorem 1 and the complete
additivity of Steinhaus measure.

THEOREM 3. If $A$ is regular and $m_c(A) < \infty$, then $St(A) = 2m_c A$.

Proof follows from Lemma 8, Lemma 9, and the complete additivity of Steinhaus
measure.

THEOREM 4. If $m_c(A) < \infty$ or if $A = \sum_{i=1}^{\infty} A_i$, $m_c(A_i) < \infty$, and $A_R$ is the
regular part of $A$, then $St(A) = 2m_c(A_R)$.

Proof. If $m_c(A) < \infty$, then $A = A_I + A_R$, where $A_I$ is the irregular part
of $A$ and $A_R$ is the regular part of $A$. By Lemma 6.1 and Theorem 3,

$St(A) = St(A_I) + St(A_R) = 0 + St(A_R) = 2m_c(A_R).$

The other case offers no new difficulties.


Multiple points. Let $\mathcal{C}$ be a continuous rectifiable curve parametrized
according to arc length. Let $E_i$ be that set of points which have exactly $i$
correspondents in the $s$-range.

Lemma 10. For $i \le \infty$, $E_i$ is measurable.

Proof. Let

$E_i(j) = \mathop{E}_{p}\Big[$there exists a set of numbers $\{\tau_K(p)\}$, $1 \le K \le i$, such that
$\tau_K(p) < \tau_{K'}(p)$, $K < K'$; $(x(\tau_K(p)), y(\tau_K(p))) = p$; $\min_{1\le K<i} |\tau_{K+1}(p) - \tau_K(p)| \ge \frac{1}{j}\Big].$

Corresponding to each $p \in E_i(j)$, there may be many sets which satisfy the
conditions above, but we associate with each $p$ a unique set which satisfies the
prescribed conditions.

We now prove that $E_i(j)$ is closed. For each sequence $\{p_n\}$ such that
$p_n \in E_i(j)$, $p_n \to p$, we can choose a subsequence $\{p_{n'}\}$ such that $\tau_K(p_{n'}) \to a_K$,
$1 \le K \le i$. Since for each $n'$, $\min_{1\le K<i} |\tau_{K+1}(p_{n'}) - \tau_K(p_{n'})| \ge \frac{1}{j}$, we have
$\min_{1\le K<i} |a_{K+1} - a_K| \ge \frac{1}{j}$. Thus $p \in E_i(j)$ and $E_i(j)$ is closed.
But

$\sum_{i\le K\le\infty} E_K = \sum_{j=1}^{\infty} E_i(j) \in F_\sigma.$

Since for $i < \infty$, $E_i = \sum_{i\le K\le\infty} E_K - \sum_{i+1\le K\le\infty} E_K$ is the difference between two
measurable sets, we have $E_i$ measurable. Since $E_\infty = \prod_{1\le i<\infty}\ \sum_{i\le K\le\infty} E_K$, we
also have that $E_\infty$ is measurable.
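THEOREM 5. If $\mathcal{C}$ is a continuous rectifiable curve parametrized according to
arc length, then

$l(\mathcal{C}) = \sum_{1\le i\le\infty} i\,m_c(E_i).$

Proof. Let $\mathcal{E}_i$ be the set of all points of $E_i$ counted with proper multiplicity.
Then

$2l(\mathcal{C}) = St(\mathcal{C}) = \iint f_{\mathcal{C}}(p, \theta)\,dp\,d\theta = \sum_{1\le i\le\infty} \iint f_{\mathcal{E}_i}(p, \theta)\,dp\,d\theta = \sum_{1\le i\le\infty} i \iint f_{E_i}(p, \theta)\,dp\,d\theta = \sum_{1\le i\le\infty} i\,St(E_i) = 2\sum_{1\le i\le\infty} i\,m_c(E_i).$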

The last theorem is a sharpening of a remark made by Saks. It would seem 
that a more direct (i.e., without reverting to Steinhaus measure) proof should 
be possible but the author has not been able to find such a proof. 
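The identity of Theorem 5 can be checked on a toy curve. The sketch below is only an illustration under simplifying assumptions: the curve lies on the $x$-axis and is described by the hypothetical list `waypoints`, running from 0 to 1 and then back to 0.5, so that points of $(0, 0.5)$ are covered once and points of $(0.5, 1)$ twice. The function `length_from_multiplicities` approximates $\sum_i i\,m_c(E_i)$ by integrating the multiplicity over the axis.

```python
def length_from_multiplicities(waypoints, n_cells=10000):
    """Integrate the multiplicity (number of legs covering x) over the x-axis."""
    lo, hi = min(waypoints), max(waypoints)
    dx = (hi - lo) / n_cells
    legs = list(zip(waypoints, waypoints[1:]))
    total = 0.0
    for k in range(n_cells):
        x = lo + (k + 0.5) * dx
        # Number of parameter values s whose image is the point x.
        multiplicity = sum(1 for a, b in legs if min(a, b) < x < max(a, b))
        total += multiplicity * dx
    return total

# The curve runs from 0 to 1 and then back to 0.5 along the same line.
waypoints = [0.0, 1.0, 0.5]
arc_length = sum(abs(b - a) for a, b in zip(waypoints, waypoints[1:]))
weighted_measure = length_from_multiplicities(waypoints)
```

Here `arc_length` is 1.5, and the weighted sum $1 \cdot 0.5 + 2 \cdot 0.5$ gives the same value; the turning point and endpoints form a finite set and contribute nothing.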


BIBLIOGRAPHY 


1. A. S. BESICOVITCH, On the fundamental geometrical properties of linearly measurable
sets of points (II), Math. Annalen, vol. 115(1938), pp. 296-329.
2. A. S. BESICOVITCH, On the fundamental geometrical properties of linearly measurable sets
of points (III), Math. Annalen, vol. 116(1939), pp. 349-357.
3. C. CARATHÉODORY, Über das lineare Mass von Punktmengen: eine Verallgemeinerung des
Längenbegriffs, Nachr. Ges. Wiss. Göttingen, 1914, pp. 404-426.
4. J. FAVARD, Une définition de la longueur et de l'aire, Comptes Rendus, vol. 194(1932),
pp. 344-346.
5. W. GROSS, Über das Flächenmass von Punktmengen, Monatshefte für Mathematik und
Physik, vol. 29(1918), pp. 145-176.
6. A. KOLMOGOROFF, Beiträge zur Masstheorie, Math. Annalen, vol. 107(1932), pp. 351-366.
7. J. F. RANDOLPH, On generalizations of length and area, Bull. A. M. S., vol. 42(1936),
pp. 268-274.
8. J. RANDOLPH AND A. MORSE, Gillespie measure, Duke Mathematical Journal, vol. 6
(1940), pp. 408-419.
9. S. SAKS, Theory of the Integral, second revised edition, Warsaw-Lwów, 1937.
10. E. SCHMIDT, Über die Definition des Begriffs der Länge krummer Linien, Math. Annalen,
vol. 55(1902), pp. 163-176.
11. W. SIERPIŃSKI, Introduction to General Topology, Toronto, 1934.
12. H. STEINHAUS, Zur Praxis der Rektifikation und zum Längenbegriff, Sächsische Akademie
zu Leipzig, Berichte, vol. 82(1930), pp. 120-130.
13. H. STEINHAUS, Sur la portée pratique et théorique de quelques théorèmes sur la mesure des
ensembles de droites, Comptes Rendus du premier congrès des mathématiciens des
pays Slaves, 1929, pp. 348-354.




CORNELL UNIVERSITY.











LIMITS OF INTEGRALS 
By RALPH PALMER AGNEW


1. Introduction. Let integrals over finite intervals be Lebesgue integrals,
and let integrals over infinite intervals be Cauchy-Lebesgue integrals defined by

(1.1)   $\int_0^{\infty} = \lim_{A\to\infty} \int_0^{A}, \qquad \int_{-\infty}^{\infty} = \lim_{A,B\to\infty} \int_{-A}^{B}.$

In case $g(t)$ is integrable over each finite interval, the identity

(1.2)   $\int_{-A}^{B} [g(t + \lambda) - g(t)]\,dt = \int_{B}^{B+\lambda} g(t)\,dt - \int_{A-\lambda}^{A} g(-t)\,dt$

implies that the equality

(1.3)   $\int_{-\infty}^{\infty} [g(t + \lambda) - g(t)]\,dt = \lim_{A\to\infty} \int_{A}^{A+\lambda} g(t)\,dt - \lim_{A\to\infty} \int_{A}^{A+\lambda} g(-t)\,dt$

holds whenever either of the two members exists. For this and other reasons,
facts relating to

(1.4)   $\lim_{A\to\infty} \int_{A}^{A+\lambda} f(t)\,dt$

are of interest.

Perhaps our most striking result is that if the limit in (1.4) exists for each $\lambda$
in some set having positive measure, then the limit exists for each real $\lambda$ and
the convergence is uniform over each finite interval. Some applications of this
result are given in §4 and §5.


2. A preliminary theorem. Our first step in the study of (1.4) is to prove the
following theorem.

THEOREM 2.1. If

(2.2)   $L(\lambda) = \lim_{A\to\infty} \int_{A}^{A+\lambda} f(t)\,dt$

exists for each $\lambda$ in some set having positive measure, then $L(\lambda)$ exists for each real $\lambda$
and $L(\lambda) = \lambda L$ where $L = L(1)$.

The hypothesis implies existence of a number $a$ such that $f(t)$ is integrable
over $b \le t \le c$ provided $b$ and $c$ are greater than $a$; all limits of integration
which we use are assumed to be greater than $a$. The identity

(2.3)   $\int_{A}^{A+\lambda_2-\lambda_1} = \int_{A}^{A+\lambda_2} - \int_{(A+\lambda_2-\lambda_1)}^{(A+\lambda_2-\lambda_1)+\lambda_1}$


Received May 10, 1941; presented to the American Mathematical Society, May 2, 1941. 













with integrand $f(t)$, implies that $L(\lambda_2 - \lambda_1)$ exists and

(2.4)   $L(\lambda_2 - \lambda_1) = L(\lambda_2) - L(\lambda_1)$

whenever $L(\lambda_1)$ and $L(\lambda_2)$ both exist. Since the set $E$ of values of $\lambda$ for which
$L(\lambda)$ exists has positive measure, there is a positive number $\delta$ such that each
number $\lambda_0$ for which $|\lambda_0| < \delta$ is representable in the form $\lambda_0 = \lambda_2 - \lambda_1$ where
$\lambda_1$ and $\lambda_2$ are points of $E$.¹ Hence $L(\lambda)$ exists when $|\lambda| < \delta$. The identity

(2.5)   $\int_{A}^{A+\lambda_1+\lambda_2} = \int_{A}^{A+\lambda_1} + \int_{(A+\lambda_1)}^{(A+\lambda_1)+\lambda_2}$

with integrand $f(t)$, implies that $L(\lambda_1 + \lambda_2)$ exists and

(2.6)   $L(\lambda_1 + \lambda_2) = L(\lambda_1) + L(\lambda_2)$

provided $L(\lambda_1)$ and $L(\lambda_2)$ both exist. It is now easy to show that $L(\lambda)$ exists
for each real $\lambda$, and that (2.4) and (2.6) hold whenever $\lambda_1$ and $\lambda_2$ are real. From
(2.4) we see that $L(\lambda)$ is continuous everywhere or discontinuous everywhere
according as $L(\lambda)$ is continuous or discontinuous at $\lambda = 0$. Since $L(\lambda)$, being
the limit of the sequence of continuous functions obtained by giving integer
values to $A$, cannot be discontinuous everywhere it must be continuous
everywhere. The functional equation (2.6) implies, in a familiar and simple way, that
$L(r) = rL(1)$ when $r$ is rational; continuity of $L(\lambda)$ then implies that $L(\lambda) =
\lambda L(1)$ for each $\lambda$ and Theorem 2.1 is proved.
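The conclusion $L(\lambda) = \lambda L(1)$ and the additivity identity (2.5) can be observed numerically. The sketch below is an illustration under assumptions that are not in the paper: the test function $f(t) = 3 + 1/(1+t)$, the large but finite value `A = 1.0e6` standing in for the limit, and the helper `window_integral`, which uses a plain midpoint rule.

```python
def window_integral(f, A, lam, n=2000):
    """Midpoint-rule approximation of the integral of f over [A, A + lam] (lam may be negative)."""
    h = lam / n
    return h * sum(f(A + (k + 0.5) * h) for k in range(n))

L = 3.0

def f(t):
    # Hypothetical test function: it tends to L, so the window integrals tend to lam * L.
    return L + 1.0 / (1.0 + t)

A = 1.0e6
value_at_2_5 = window_integral(f, A, 2.5)        # close to 2.5 * L
value_at_minus_1 = window_integral(f, A, -1.0)   # close to -1.0 * L
# Identity (2.5): the window [A, A + 1.5] splits into [A, A + 1] and [A + 1, A + 1.5].
whole = window_integral(f, A, 1.5)
pieces = window_integral(f, A, 1.0) + window_integral(f, A + 1.0, 0.5)
```

The example only illustrates the conclusion; the strength of Theorem 2.1 lies in assuming the limit merely on a set of positive measure, which a numerical check cannot exhibit.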


3. Uniformity of the convergence. In this section, we prove the following 
theorem. 


THEOREM 3.1. If

(3.2)   $\lim_{A\to\infty} \int_{A}^{A+\lambda} f(t)\,dt = \lambda L, \qquad -\infty < \lambda < \infty,$

then the convergence in (3.2) is uniform over each finite interval $-a \le \lambda \le a$.

Let $a$ be a fixed positive number, let $E_1$ denote the interval $-a \le \lambda \le a$,
and let the measure of $E_1$ be denoted by $|E_1|$ so that $|E_1| = 2a$. By a theorem
of Egoroff, the convergence in (3.2) is essentially uniform over $E_1$; that is, to
each $\delta > 0$ there corresponds a subset $E$ of $E_1$ such that $|E| > |E_1| - \delta$ and
the convergence is uniform over $E$. Let $\delta$ and $E$ be fixed such that $\delta < \tfrac{1}{2}a$
and accordingly $|E| > \tfrac{3}{2}a$. Let $A(\epsilon)$ be a function, defined for $\epsilon > 0$, such that

(3.3)   $\left|\int_{A}^{A+\lambda} f(t)\,dt - \lambda L\right| < \tfrac{1}{2}\epsilon, \qquad \lambda \in E,\ A > A(\epsilon).$


1 This fact, first proved by Steinhaus, Fundamenta Mathematicae, vol. 1(1920), pp. 
93-104, has since received very simple proofs. One chooses a point at which the density is 
greater than 3} and applies the idea following equation (3.4) below. It is an interesting 
fact that some sets of measure 0, notably the Cantor middle-third set, have the essential 


property which we are using. 













Let $\lambda_0$ represent any point in the interval $-a \le \lambda_0 \le a$. Then points $\lambda_1$ and
$\lambda_2$ of the set $E$ exist such that

(3.4)   $\lambda_0 = \lambda_2 - \lambda_1.$

To prove this, we observe that if such a representation of $\lambda_0$ were impossible,
then the set $E$ could have no points in common with the set $E_0$ obtained by
translating the set $E$ to the right $|\lambda_0|$ units; this would lead to the absurd
conclusion that $E$ and $E_0$ are two disjoint subsets of the interval $-a \le \lambda \le 2a$
each having measure greater than $\tfrac{3}{2}a$.

Use of the representation (3.4) and the inequality (3.3) gives, when $A >
a + A(\epsilon)$ and $-a \le \lambda_0 \le a$,

$\left|\int_{A}^{A+\lambda_0} f(t)\,dt - \lambda_0 L\right| = \left|\int_{A}^{A+\lambda_2-\lambda_1} f(t)\,dt - (\lambda_2 - \lambda_1)L\right|$

$= \left|\left\{\int_{A}^{A+\lambda_2} f(t)\,dt - \lambda_2 L\right\} - \left\{\int_{(A+\lambda_2-\lambda_1)}^{(A+\lambda_2-\lambda_1)+\lambda_1} f(t)\,dt - \lambda_1 L\right\}\right| < \epsilon$

and the uniform convergence is established.
Because of the identity (1.2), it is a consequence of Theorems 2.1 and 3.1
that if

$\lim_{A,B\to\infty} \int_{-A}^{B} [g(t + \lambda) - g(t)]\,dt$

exists for each $\lambda$ in some set having positive measure, then there is a constant $M$
such that the limit is $\lambda M$ uniformly over each finite interval of values of $\lambda$.


4. Two theorems of Iyengar. In this section we prove two theorems which,
as may be seen by making an exponential change of variable and suitable changes
in notation, imply and are implied by a theorem and other results of Iyengar.²
The proofs of Iyengar are ingenious; by making use of Theorem 3.1 we obtain
simpler proofs.³

THEOREM 4.1. A necessary and sufficient condition that

(4.11)   $\lim_{A\to\infty} \int_{A}^{A+\lambda} f(t)\,dt$
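2 K. S. K. Iyengar, On Frullani integrals, Proc. Cambridge Philos. Soc., vol. 37(1941),
pp. 9-13. The condition

$\lim_{\epsilon\to 0} \int_{\epsilon\rho}^{\epsilon} \frac{\phi(u)}{u}\,du = B\log\rho, \qquad \rho > 0,$

of Iyengar becomes (4.21) when we set
$t = \log u^{-1}$, $A = \log \epsilon^{-1}$, $\lambda = \log \rho^{-1}$, $L = -B$, $f(t) = \phi(e^{-t})$.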


3 The author must confess that these theorems seem strange to him; he and some of 
his colleagues feel it to be incredible that no Tauberian conditions are involved in the 
theorems. 















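exist for each real $\lambda$ is that

(4.12)   $\lim_{A\to\infty} e^{A} \int_{A}^{\infty} f(t)e^{-t}\,dt$

exist.

THEOREM 4.2. A necessary and sufficient condition that

(4.21)   $\lim_{A\to\infty} \int_{A}^{A+\lambda} f(t)\,dt = \lambda L, \qquad -\infty < \lambda < \infty,$

is that

(4.22)   $\lim_{A\to\infty} e^{A} \int_{A}^{\infty} f(t)e^{-t}\,dt = L.$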


Because of the equality

(4.23)   $e^{A} \int_{A}^{\infty} f(t)e^{-t}\,dt = \int_{A}^{\infty} f(t)e^{-(t-A)}\,dt = \int_{0}^{\infty} f(t + A)e^{-t}\,dt,$

which holds whenever any one of the integrals exists, the conditions (4.12) and
(4.22) can be put in different forms.

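Using Theorem 2.1, we can see that Theorem 4.1 is a corollary of Theorem
4.2. We prove Theorem 4.2. To prove necessity, let $a$ be a fixed positive
number; we could take $a = 1$. Then, by Theorem 3.1, to each $\epsilon > 0$
corresponds a number $A_0 = A_0(\epsilon) > 0$ such that

(4.24)   $\left|\int_{A}^{A+\lambda} f(t)\,dt - \lambda L\right| < \tfrac{1}{2}\epsilon(1 - e^{-a}), \qquad A > A_0,\ 0 \le \lambda \le a.$

Let

(4.25)   $f_1(t) = f(t) - L,$

so that (4.24) may be written

(4.26)   $\left|\int_{A}^{A+\lambda} f_1(t)\,dt\right| < \tfrac{1}{2}\epsilon(1 - e^{-a}), \qquad A > A_0,\ 0 \le \lambda \le a.$

Let $x$ and $A$ be momentarily fixed such that $x > A > A_0$, and choose an index $N$
such that

(4.27)   $A + Na < x \le A + (N + 1)a.$

We are going to use the inequality

(4.28)   $\left|\int_{A}^{x}\right| \le \sum_{n=1}^{N} \left|\int_{A+(n-1)a}^{A+na}\right| + \left|\int_{A+Na}^{x}\right|$

with integrand $e^{A} f_1(t)e^{-t}$. Using the second mean value theorem⁴ we obtain
for each $n = 1, 2, \dots, N$

(4.29)   $\int_{A+(n-1)a}^{A+na} f_1(t)e^{-t}\,dt = e^{-A}e^{-(n-1)a} \int_{A+(n-1)a}^{\xi_n} f_1(t)\,dt + e^{-A}e^{-na} \int_{\xi_n}^{A+na} f_1(t)\,dt,$

where $\xi_n$ is a properly chosen point between $A + (n - 1)a$ and $A + na$. Using
(4.29) and (4.26) we obtain

(4.30)   $\left|e^{A} \int_{A+(n-1)a}^{A+na} f_1(t)e^{-t}\,dt\right| < \epsilon(1 - e^{-a})e^{-(n-1)a}.$

Likewise

(4.31)   $\left|e^{A} \int_{A+Na}^{x} f_1(t)e^{-t}\,dt\right| < \epsilon(1 - e^{-a})e^{-Na}.$

From (4.28), (4.30), and (4.31) we obtain

(4.32)   $\left|e^{A} \int_{A}^{x} f_1(t)e^{-t}\,dt\right| < \epsilon, \qquad x > A > A_0(\epsilon).$

Since $A_0(\epsilon)$ was chosen greater than 0, this implies that

(4.33)   $\left|\int_{A}^{x} f_1(t)e^{-t}\,dt\right| < \epsilon, \qquad x > A > A_0(\epsilon);$

and hence the Cauchy criterion for convergence implies existence of

(4.34)   $\int_{B}^{\infty} f_1(t)e^{-t}\,dt$

for each sufficiently great constant $B$. Hence we can let $x$ become infinite in
(4.32) to obtain

(4.35)   $\left|e^{A} \int_{A}^{\infty} f_1(t)e^{-t}\,dt\right| < \epsilon, \qquad A > A_0(\epsilon),$

and therefore

(4.36)   $\left|e^{A} \int_{A}^{\infty} f(t)e^{-t}\,dt - L\right| < \epsilon, \qquad A > A_0(\epsilon).$

This completes proof of necessity for Theorem 4.2.
To prove sufficiency, let

(4.37)   $F(t) = e^{t} \int_{t}^{\infty} f(u)e^{-u}\,du;$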

4 In case $f(t)$ is complex valued, we obtain the results separately for the real and
imaginary parts of $f(t)$ and Theorem 4.2 then follows.
























our hypothesis (4.22) then becomes

(4.38)   $\lim_{t\to\infty} F(t) = L.$

Differentiating (4.37) gives for almost all $t$ (that is, for all $t$ except those in a
set of measure 0)

(4.39)   $F'(t) = F(t) - f(t).$

Integrating over the interval with end points at $A$ and $A + \lambda$ gives

(4.40)   $F(A + \lambda) - F(A) = \int_{A}^{A+\lambda} F(t)\,dt - \int_{A}^{A+\lambda} f(t)\,dt.$

Because of (4.38), we can let $A$ become infinite in (4.40) to obtain (4.21). This
completes proof of Theorems 4.1 and 4.2.
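Both conditions of Theorem 4.2 can be evaluated numerically for a function that has no limit at infinity. The sketch below is an illustration with assumptions that are not in the paper: the test function $f(t) = 2 + \cos(t^2)$, the finite value `A = 50.0` standing in for the limit, truncation of the infinite integral at $u = 40$, and the helper `midpoint`. The second quantity uses the form $\int_0^\infty f(u + A)e^{-u}\,du$ given by (4.23).

```python
import math

def midpoint(g, a, b, n):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

L = 2.0

def f(t):
    # Hypothetical test function: bounded and oscillating, with no limit as t grows.
    return L + math.cos(t * t)

A = 50.0
# Condition (4.21) with lambda = 1: the integral of f over [A, A + 1].
window = midpoint(f, A, A + 1.0, 20000)
# Condition (4.22) in the form (4.23), truncated at u = 40 where exp(-u) is negligible.
weighted = midpoint(lambda u: f(u + A) * math.exp(-u), 0.0, 40.0, 200000)
```

Both `window` and `weighted` come out close to `L`, with discrepancies of order $1/A$, even though $f(t)$ itself keeps oscillating between 1 and 3.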


5. A Tauberian theorem. We now use Theorem 4.2 to prove the following 
theorem which could be easily generalized by replacing F(t) by F(t) — L, 
L being a constant. 


THEOREM 5.1. If $F(t)$ is absolutely continuous over each finite interval and

(5.11) $\lim_{A\to\infty} \int_{A}^{A+\lambda} [F(t) - F'(t)]\,dt = 0, \qquad -\infty < \lambda < \infty,$

and

(5.12) $\lim_{t\to\infty} e^{-t} F(t) = 0,$

then

(5.13) $\lim_{t\to\infty} F(t) = 0.$


To prove this theorem, put

(5.14) $f(t) = F(t) - F'(t).$

Then, by Theorem 4.2,

(5.15) $\lim_{t\to\infty} e^{t} \int_{t}^{\infty} f(u) e^{-u}\,du = 0.$

Multiplying (5.14) by the integrating factor $e^{-t}$ gives

(5.16) $\frac{d}{dt}\, e^{-t} F(t) = -f(t) e^{-t}$

and, since (5.15) implies existence of the integral on the right,

(5.17) $e^{-t} F(t) = c + \int_{t}^{\infty} f(u) e^{-u}\,du.$

The result of letting $t$ become infinite shows that $c = 0$. Hence

(5.18) $F(t) = e^{t} \int_{t}^{\infty} f(u) e^{-u}\,du$

and our result follows from (5.15).

That the Tauberian condition (5.12) cannot be removed from Theorem 5.1 
is an obvious consequence of the fact that the function $F(t) = e^{t}$ satisfies (5.11)
but not (5.13). 


6. Bounded functions $f(t)$. It was pointed out to the author by R. P. Boas
that, in case $f(t)$ is bounded and measurable and the equality

(6.01) $\lim_{A\to\infty} \int_{A}^{A+\lambda} f(t)\,dt = \lambda L,$

in which $L$ is a constant, holds for two values $\lambda_1$ and $\lambda_2$ of $\lambda$ for which $\lambda_1/\lambda_2$ is
irrational, an application of a Tauberian theorem of Wiener establishes the
equality

(6.02) $\lim_{A\to\infty} e^{A} \int_{A}^{\infty} f(t) e^{-t}\,dt = L;$

in this case (6.01) holds also for all values of $\lambda$. It is a consequence of a Tauberian
theorem of Wiener$^5$ that if $K_1(t)$, $K_2(t)$, $K_3(t)$ are three functions having
absolutely convergent integrals over $-\infty < t < \infty$, if

(6.11) $\int_{-\infty}^{\infty} K_j(t)\,dt = 1, \qquad j = 1, 2, 3,$

if the Fourier transforms of $K_1(t)$ and $K_2(t)$ have no common zeros, and if $f(t)$
is a bounded measurable function for which

(6.12) $\lim_{A\to\infty} \int_{-\infty}^{\infty} K_j(t - A) f(t)\,dt = L$

when $j = 1, 2$, then (6.12) holds when $j = 3$. On setting, when $j = 1, 2$,

$K_j(t) = \lambda_j^{-1}$ for $0 \le t < \lambda_j$,
$K_j(t) = 0$ otherwise,

and

(6.13) $K_3(t) = e^{-t}$ for $t \ge 0$,
       $K_3(t) = 0$ for $t < 0$,


we obtain the conclusion (6.02). If we set, for $\lambda > 0$,

$K_3(t) = \lambda^{-1}$ for $0 \le t \le \lambda$,
$K_3(t) = 0$ otherwise,

we obtain (6.01) for $\lambda > 0$; and it then follows easily that (6.01) holds when
$\lambda < 0$. That (6.02) implies (6.01) follows, in case $f(t)$ is bounded, from Theorem 4,
p. 73, of Wiener's book and the fact that the Fourier transform of the
function $K_3(t)$ in (6.13) has no zeros.

5 N. Wiener, The Fourier Integral, Cambridge (1933), p. 75, Theorem 6.

Even when $f(t)$ is not assumed bounded, the hypothesis that (6.01) holds
when $\lambda = \lambda_1$ and when $\lambda = \lambda_2$ is no less general than the hypothesis that the left
member of (6.01) exists when $\lambda = \lambda_1$ and when $\lambda = \lambda_2$. This is a consequence
of the following theorem.


THEOREM 6.2. If

(6.21) $\lim_{A\to\infty} \int_{A}^{A+\lambda} f(t)\,dt$

exists for some $\lambda \ne 0$, and $t_0$ is fixed such that $f(t)$ is integrable over each finite
interval $t_0 \le t \le B < \infty$, then

(6.22) $\lim_{B\to\infty} \frac{1}{B} \int_{t_0}^{B} f(t)\,dt$

exists; moreover the equality

(6.23) $\lim_{A\to\infty} \int_{A}^{A+\lambda} f(t)\,dt = \lambda \lim_{B\to\infty} \frac{1}{B} \int_{t_0}^{B} f(t)\,dt$

holds for each $\lambda$ for which the left member exists.


Assuming first that $\lambda$ is a positive number for which (6.21) exists, we prove
(6.23). On denoting the limit by $\lambda L$ and setting $f_1(t) = f(t) - L$, we obtain

$\lim_{A\to\infty} \int_{A}^{A+\lambda} f_1(t)\,dt = 0.$

Suppose $\epsilon > 0$ and choose $A_0 > t_0$ such that

$\left| \int_{A}^{A+\lambda} f_1(t)\,dt \right| < \lambda\epsilon/2, \qquad A > A_0.$

Corresponding to each $B > A_0$, let $\xi(B)$ and $N(B)$ be numbers such that
$A_0 \le \xi(B) < A_0 + \lambda$, $N(B)$ is an integer, and

$B = \xi(B) + N(B)\lambda.$

Then, when $B > A_0$,

$\int_{t_0}^{B} f_1(t)\,dt = \int_{t_0}^{\xi(B)} f_1(t)\,dt + \sum_{n=1}^{N(B)} \int_{\xi(B)+(n-1)\lambda}^{\xi(B)+n\lambda} f_1(t)\,dt$

so that

(6.24) $\left| \frac{1}{B} \int_{t_0}^{B} f_1(t)\,dt \right| \le \frac{M}{B} + \frac{N(B)\lambda\epsilon}{2B},$

where $M$ denotes the maximum over the interval $A_0 \le u \le A_0 + \lambda$ of the continuous
function

$\left| \int_{t_0}^{u} f_1(t)\,dt \right|.$

Hence the left member of (6.24) converges to 0 as $B \to \infty$; and, using the fact
that $f_1(t) = f(t) - L$, we obtain

$\lim_{B\to\infty} \frac{1}{B} \int_{t_0}^{B} f(t)\,dt = L.$

Since the limit in (6.21) is $\lambda L$, we obtain (6.23). The case in which $\lambda \le 0$ now
follows easily and Theorem 6.2 is proved.
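
The equality (6.23) can be watched numerically on a simple function. The sketch below assumes the illustrative choice $f(t) = L + 1/(1+t)$ with $L = 2$ and $t_0 = 0$ (the function, the value of $L$, and the names `window_integral` and `mean_value` are not from the paper); both integrals have elementary closed forms, so no quadrature is needed.

```python
import math

L = 2.0  # assumed limit value, chosen only for illustration

def window_integral(A, lam):
    """Exact integral of f(t) = L + 1/(1 + t) over [A, A + lam]."""
    return lam * L + math.log((1.0 + A + lam) / (1.0 + A))

def mean_value(B, t0=0.0):
    """Exact value of (1/B) times the integral of f over [t0, B]."""
    return (L * (B - t0) + math.log((1.0 + B) / (1.0 + t0))) / B

lam = 0.75
# Left member of (6.23) next to lam times the mean in (6.22), for growing A = B.
for A in (1e2, 1e4, 1e6):
    print(A, window_integral(A, lam), lam * mean_value(A))
```

Both columns approach $\lambda L = 1.5$, the first faster than the second, which is consistent with the mean in (6.22) carrying the slowly decaying term $M/B$ of (6.24).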

Certain functions of the form 

f) =L+t*sn’’ 
show that the hypothesis that (6.01) holds for each $\lambda$ does not imply that $f(t)$
is bounded. Other examples of functions $f(t)$ for which (6.01) holds for each $\lambda$
have the form $f(t) = n\sigma_n$ when $n$ is a positive integer and $n \le t < n + n^{-2}$,
and $f(t) = 0$ otherwise, the numbers $\sigma_1, \sigma_2, \dots$ forming a real bounded sequence.
It is easy to determine the sequence $\sigma_n$ in such a way that the functions $f(t)$ and

$\int_{0}^{x} f(t)\,dt$

each have, as their arguments become infinite, inferior limit $-\infty$ and superior
limit $+\infty$.

When $f(t)$ is not assumed bounded, the hypothesis that (6.01) holds for two
values $\lambda_1$ and $\lambda_2$ of $\lambda$ for which $\lambda_1/\lambda_2$ is irrational does not imply that (6.01)
holds for each real $\lambda$. This is established by the following example. Let
$n_1, n_2, n_3, \dots$ be an increasing sequence of positive integers for which
$n_{p+1}/n_p \to \infty$ as $p \to \infty$. Suppose, to simplify typography,


d~=2”, ap=4", b,=4", 
Suppose $f(t) = 0$ except when $t$ is a point of one of the intervals $a_p < t < b_p$.

For each $p = 1, 2, 3, \dots$, let $f(x)$ be defined over the interval $a_p < x < b_p$
as follows. For each integer $k$ for which $1 \le k \le (b_p - a_p)/2d_p$, suppose

$f(x) = (-1)^{k}\, 2k/(b_p - a_p), \qquad a_p + (k-1)d_p < x \le a_p + kd_p;$

and let $f(x)$ be defined over the remaining half of the interval by the formula

$f(b_p - x) = -f(a_p + x), \qquad 0 < x < \tfrac{1}{2}(b_p - a_p).$

For each $j = 1, 2, 3, \dots$, the limit in (6.01) exists and is 0 when $\lambda = 2^{-j}$; an
examination of the graph of $f(t)$ indicates the manner in which proof proceeds.

The limit in (6.01) also exists and is 0 when $\lambda$ is the irrational number $\mu$ defined by
p= 2; 


p=1 
















proof of this depends upon the fact that $\mu/d_p$ is only a little greater than an
even integer when $p$ is large, that is,

$\lim_{p\to\infty} (\mu/d_p - 2[\mu/2d_p]) = 0,$

where $[x]$ denotes the greatest integer in $x$. That the conclusion of Theorem 3.1
fails to hold for this function is an obvious consequence of the fact that, for
each $p = 1, 2, \dots$, there is an interval of length $d_p$ over which the integral of
$f(t)$ is 1 and another of the same length over which the integral is $-1$. Hence
Theorems 2.1 and 3.1 imply that the set of $\lambda$ for which the limit in (6.01) exists
must have measure 0. It can in fact be proved that, for this example, the limit
in (6.01) exists for a given $\lambda$ if and only if $\lambda$ can be represented in the form

$\lambda = \lambda_0 + \sum_{j=1}^{\infty} \theta_j 2^{-j},$

where $\lambda_0$ is an integer, each $\theta_j$ is 0 or 1, and a sequence $m_p$ of integers exists such
that $m_p \to \infty$ and


Mp? Np [J TN t+ M,, 


for each $p = 1, 2, 3, \dots$.


CORNELL UNIVERSITY.








CLASSIFICATION OF SOLUTIONS AND OF PAIRS OF SOLUTIONS OF
$y''' + 2py' + p'y = 0$ BY MEANS OF INITIAL CONDITIONS


By JOSEPH J. EACHUS


The following facts concerning real solutions of the real differential equation
$y''' + 2py' + p'y = 0$ are either explicitly stated in or are readily deducible
from the work of G. D. Birkhoff [Annals of Mathematics, (2), vol. 12 (1911),
pp. 103-127]. There are non-vanishing solutions. If a solution has a double
zero, it has no simple zeros. If each of two linearly independent solutions has
double zeros, their zeros interlace along the $x$-axis. If one of two solutions has
double zeros while the second has simple zeros, either each zero of the first
coincides with a zero of the second, there being exactly one zero of the second
between successive coincidences, or the solutions have no zeros in common,
there being exactly two zeros of the second between successive zeros of the first.
If each of two linearly independent solutions has simple zeros, either their zeros
interlace, or they do so in pairs, or alternate zeros of the first coincide with
alternate zeros of the second.

It is the purpose of this paper to distinguish between the three types of solu- 
tions, and between the various situations with regard to two solutions, by means 
of numbers determined by the values of the solutions and their first two deriva- 
tives at any given point. In order to achieve this end, it is necessary to demon- 
strate in a manner different from that of Birkhoff that the above statements 
are true. 

The author gratefully acknowledges suggestions made by R. D. Carmichael, 
leading to the preparation of this paper. 

Let $p = p(x)$, on the interval $(a, b)$, be a real continuous function of the real
variable $x$, with a continuous derivative. Consider

(1) $y''' + 2py' + p'y = 0.$

We postulate that some solution of this equation has at least two double zeros
on $(a, b)$. It will become apparent in the course of discussion that this is
equivalent to demanding that some solution have at least three zeros on $(a, b)$.
Let $y_i$ and $y_j$ be any two solutions of (1). Then

$y_i(y_j''' + 2py_j' + p'y_j) + y_j(y_i''' + 2py_i' + p'y_i)$
$\quad = y_i y_j''' + y_i''' y_j + 2p y_i y_j' + 2p y_i' y_j + 2p' y_i y_j = 0.$

By integration we obtain

(2) $y_i y_j'' + y_i'' y_j - y_i' y_j' + 2p y_i y_j = C_{ij},$


Received June 9, 1941; in revised form, December 30, 1941. Presented to the American 
Mathematical Society, December, 1940. 
1 The terms "double zero" and "simple zero" are used in the sense that $x_1$ is a double
zero of $y_1$ if $y_1(x_1) = y_1'(x_1) = 0$, $y_1''(x_1) \ne 0$, and $x_2$ is a simple zero of $y_2$ if $y_2(x_2) = 0$,
$y_2'(x_2) \ne 0$.







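where $C_{ij}$ is the constant of integration. If $y_i = y_j$, then

(3) $2y_i y_i'' - (y_i')^2 + 2p y_i^2 = C_{ii} = C_i.$

Now let $y_k = y_i + \lambda y_j$, where $\lambda$ is any constant, and let $C_j$ and $C_k$ be the
analogues of $C_i$ for $y_j$ and $y_k$. Then

$C_k = 2(y_i + \lambda y_j)(y_i'' + \lambda y_j'') - (y_i' + \lambda y_j')^2 + 2p(y_i + \lambda y_j)^2$
$\quad = 2y_i y_i'' - (y_i')^2 + 2p y_i^2 + 2\lambda(y_i y_j'' + y_i'' y_j - y_i' y_j' + 2p y_i y_j)$
$\qquad + \lambda^2 (2y_j y_j'' - (y_j')^2 + 2p y_j^2) = C_i + 2\lambda C_{ij} + \lambda^2 C_j.$

Thus we have

(4) $C_k = C_i + 2\lambda C_{ij} + \lambda^2 C_j,$

if $y_k = y_i + \lambda y_j$.
Let us define one more constant, to be associated with a pair of solutions:

(5) $D_{ij} = C_{ij}^2 - C_i C_j.$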


From (3) it is evident that if $y_i$ has a double zero, $C_i = 0$, and if $y_i$ has a
simple zero, $C_i < 0$, and therefore no solution has both double and simple zeros.
It is likewise evident that $y_i$ has no zeros if $C_i > 0$. Under the postulate that
some solution has two or more double zeros, it is also true that $C_1 > 0$ if $y_1$
has no zeros, and in fact if $y_1$ has no zeros on an interval whose end points
are zeros of $y_2$, $y_2$ being the postulated solution with two double zeros. For,
let $x_1$ and $x_2$ be successive (double) zeros of $y_2$. The function $y_2/y_1$ and its
derivative are continuous over $(x_1, x_2)$ and the function vanishes at the end
points of the interval. There must then be a point $x_0$, $x_1 < x_0 < x_2$, where
the derivative vanishes, that is,

$y_2'(x_0) y_1(x_0) - y_1'(x_0) y_2(x_0) = 0.$

Write

$y_3 = \frac{y_2(x_0) y_1 - y_1(x_0) y_2}{y_2(x_0)}.$

Then $y_3(x_0) = y_3'(x_0) = 0$, and

(6) $y_1 = y_3 + \frac{y_1(x_0)}{y_2(x_0)}\, y_2,$

with $C_2 = 0$, $C_3 = 0$. It follows from (4) that

$C_1 = 2\, \frac{y_1(x_0)}{y_2(x_0)}\, C_{23}.$

2 If $y_i$ and $y_j$ are solutions of (1), then so also is $Y = y_i y_j' - y_i' y_j$ (see Birkhoff). It
may be noted that $-D_{ij}$ is the analogue of $C_i$ for $Y$.










But from (2),

$C_{23} = y_2(x_1) y_3''(x_1) + y_2''(x_1) y_3(x_1) - y_2'(x_1) y_3'(x_1) + 2p(x_1) y_2(x_1) y_3(x_1)$
$\quad = y_2''(x_1) y_3(x_1),$

since $y_2(x_1) = y_2'(x_1) = 0$.
Also from (6), $y_3(x_1) = y_1(x_1)$ and hence

(7) $C_1 = \frac{2 y_1(x_0) y_1(x_1) y_2''(x_1)}{y_2(x_0)}.$

Since $y_1$ is non-vanishing over $(x_1, x_2)$, $y_1(x_0)$ and $y_1(x_1)$ have the same sign.
As $y_2(x_0)$ and $y_2''(x_1)$ must also have the same sign, it follows that $C_1 > 0$.
Since the conditions $C_1 = 0$, $C_1 < 0$, $C_1 > 0$ are exhaustive and mutually
exclusive, as are the conditions $y_1$ has double zeros, $y_1$ has simple zeros, $y_1$ has no
zeros, we may state

THEOREM I. If $y_1$ is a solution of (1), and there is some solution of (1) having
at least two double zeros, then

(A) $C_1 = 0$, if and only if $y_1$ has double zeros;
(B) $C_1 < 0$, if and only if $y_1$ has simple zeros;
(C) $C_1 > 0$, if and only if $y_1$ has no zeros.
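
Theorem I can be checked by hand in the simplest case. The sketch below assumes the illustrative choice $p(x) = 1$, for which (1) becomes $y''' + 2y' = 0$ with solutions $1$, $\cos\sqrt{2}x$, $\sin\sqrt{2}x$; the three sample solutions and the helper names `invariant_C` and `C_values` are chosen here for illustration and are not from the paper. The left member of (3) is evaluated at several points to show that it is constant and has the sign Theorem I predicts.

```python
import math

S = math.sqrt(2.0)  # assumption: p(x) = 1, so equation (1) reads y''' + 2y' = 0

def invariant_C(y, dy, d2y, x, p=1.0):
    """Left member of (3): 2*y*y'' - y'^2 + 2*p*y^2, evaluated at x."""
    return 2.0 * y(x) * d2y(x) - dy(x) ** 2 + 2.0 * p * y(x) ** 2

# Each entry holds (y, y', y'') for one solution of y''' + 2y' = 0.
solutions = {
    "double zeros": (lambda x: 1.0 + math.cos(S * x),
                     lambda x: -S * math.sin(S * x),
                     lambda x: -2.0 * math.cos(S * x)),
    "simple zeros": (lambda x: math.cos(S * x),
                     lambda x: -S * math.sin(S * x),
                     lambda x: -2.0 * math.cos(S * x)),
    "no zeros": (lambda x: 1.0, lambda x: 0.0, lambda x: 0.0),
}

def C_values(name, points=(0.0, 0.3, 1.7, 4.2)):
    """Values of the invariant at several points; they should all coincide."""
    y, dy, d2y = solutions[name]
    return [invariant_C(y, dy, d2y, x) for x in points]

for name in solutions:
    print(name, C_values(name))
```

Up to rounding, the three rows are constant and equal to 0, $-2$ and $2$ respectively, matching cases (A), (B) and (C).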


Consider now two solutions, each having double zeros. Neither can fail to 
vanish between successive zeros of the other, and hence their zeros alternate 
along the x-axis. If we refer to the zeros of a solution having double zeros as 
a “set’’, any two sets interlace. 

Suppose now that $y_1$ is some solution having simple zeros, and we look for
solutions $y_i$ such that $C_i = 0$, $C_{1i} = 0$. This is most easily attacked by
examining (2) and (3) at some point $x_1$ which is a zero of $y_1$. Here we require that

(8) $y_1''(x_1) y_i(x_1) - y_1'(x_1) y_i'(x_1) = 0$

and

(8') $2y_i(x_1) y_i''(x_1) - (y_i'(x_1))^2 + 2p(x_1)(y_i(x_1))^2 = 0.$

There are two linearly independent solutions of (1) which satisfy (8) and (8'),
one defined by

$y(x_1) = y'(x_1) = 0, \qquad y''(x_1) = 1,$

the other by

$y(x_1) = 1, \qquad y'(x_1) = \frac{y_1''(x_1)}{y_1'(x_1)}, \qquad y''(x_1) = \frac{1}{2} \left( \frac{y_1''(x_1)}{y_1'(x_1)} \right)^2 - p(x_1).$

Designate these by $y_2$ and $y_3$ respectively. By considering (8) and (8') as an
algebraic system, it is easily seen that every solution which satisfies (8) and (8')
is linearly dependent on $y_2$ or on $y_3$. Since $C_2 = 0$, $C_3 = 0$, $C_{12} = 0$, $C_{13} = 0$,
we see from (2) that every (double) zero of $y_2$ and of $y_3$ is a zero of $y_1$. Now
every zero of $y_1$ is either a zero of $y_2$ or a zero of $y_3$, for, let $x_2$ be any zero of $y_1$
and let $y_4$ be defined by $y_4(x_2) = y_4'(x_2) = 0$, $y_4''(x_2) = 1$. Then $C_4 = 0$ and
$C_{14} = 0$, and $y_4$ is linearly dependent on $y_2$ or on $y_3$, whence either $y_2(x_2) = 0$
or $y_3(x_2) = 0$. Thus, if a solution $y_1$ has simple zeros, those zeros compose two
sets, and

THEOREM II. If $C_1 < 0$ and $C_2 = 0$, then

(A) if and only if $C_{12} = 0$, every zero of $y_2$ is a zero of $y_1$, and alternate zeros
of $y_1$ are zeros of $y_2$;

(B) if and only if $C_{12} \ne 0$, successive zeros of $y_2$ are separated by exactly two
zeros of $y_1$.

The zeros of $y_1$ compose two sets if $C_1 < 0$, and in fact may be any two sets.
For, let $x_1$ be a point of one set and $x_2$ a point of another. Define $y_1$, $y_2$ and $y_3$ by

$y_1(x_1) = 0, \qquad y_1'(x_1) = 1, \qquad y_1''(x_1) = 0,$
$y_2(x_1) = y_2'(x_1) = 0, \qquad y_2''(x_1) = 1, \qquad y_3 = y_2(x_2) y_1 - y_1(x_2) y_2.$

Then $y_3(x_1) = y_3(x_2) = 0$, and $y_3 \not\equiv 0$. If $y_4(x_1) = y_4(x_2) = 0$, $y_4$ is readily
seen to be linearly dependent on $y_3$.

Since the zeros of any two sets interlace, there are only three possible configurations
of the zeros of $y_1$ and $y_2$ if $C_1 < 0$, $C_2 < 0$, and $y_1$ and $y_2$ are linearly
independent. Either the zeros of $y_1$ and those of $y_2$ separate one another, or they
do so in pairs, or alternate zeros of $y_1$ coincide with alternate zeros of $y_2$.

Now let $y_3 = y_1 + \lambda y_2$. The three cases may be distinguished by the number
of values of $\lambda$ for which $C_3$ is zero, $C_3 = C_1 + 2\lambda C_{12} + \lambda^2 C_2$ (see (4)). In the
first case (separation of the zeros singly) there are evidently no such values
of $\lambda$. If there were, $y_3$ for that $\lambda$ would be always of one sign, but $y_1$, and
hence $y_3$, cannot have the same sign at successive zeros of $y_2$, as there is exactly
one intervening (simple) zero of $y_1$. It follows that in this case, $D_{12} =
C_{12}^2 - C_1 C_2$ is negative. In the second case (separation of zeros by pairs)
there are two values of $\lambda$ for which $C_3 = 0$. The argument used in establishing
Theorem I shows that if one solution does not vanish on an interval and another
has zeros at the ends of the interval, there is then a linear combination of the
two which has a double zero within the interval. In the present case there
must be a $\lambda$ such that the corresponding $y_3$ has a double zero between two
non-separated zeros of $y_1$, and a $\lambda$ such that the corresponding $y_3$ has a double zero
between two non-separated zeros of $y_2$. These values of $\lambda$ may not be the
same, as the zeros of the respective $y_3$'s are not members of the same set, in
view of the interlacing of sets. We have, then, that $D_{12}$ is positive in this case.
Finally, if alternate zeros of $y_1$ coincide with alternate zeros of $y_2$, $D_{12}$ is zero,
and there is one and only one $\lambda$ for which $C_3 = 0$. For, let $x_1$ be a coincident
zero. From (2) and (3),

$C_1 = -(y_1'(x_1))^2, \qquad C_2 = -(y_2'(x_1))^2, \qquad C_{12} = -y_1'(x_1) y_2'(x_1)$













and

$C_3 = -[(y_1'(x_1))^2 + 2\lambda y_1'(x_1) y_2'(x_1) + \lambda^2 (y_2'(x_1))^2]$
$\quad = -(y_1'(x_1) + \lambda y_2'(x_1))^2.$

As $y_2'(x_1) \ne 0$, the statement is established.
Since $D_{12}$ must be positive, negative or zero, we have

THEOREM III. If $C_1 < 0$ and $C_2 < 0$, then
(A) if and only if $D_{12} < 0$, the zeros of $y_1$ and those of $y_2$ separate one another;
(B) if and only if $D_{12} > 0$, the zeros of $y_1$ and those of $y_2$ separate one another
by pairs;
(C) if and only if $D_{12} = 0$, alternate zeros of $y_1$ coincide with alternate zeros of $y_2$.
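
The three cases of Theorem III can be seen on explicit pairs. The sketch below again assumes $p(x) = 1$, so that every solution of (1) has the form $a + b\cos\sqrt{2}x + c\sin\sqrt{2}x$; the coefficient triples and the helper names `jet`, `C_pair` and `D` are illustrative assumptions, not taken from the paper. The constants are computed directly from the left member of (2) and from definition (5).

```python
import math

S = math.sqrt(2.0)  # assumption: p(x) = 1, so (1) is y''' + 2y' = 0

def jet(coef, x):
    """Return (y, y', y'') at x for y = a + b*cos(S*x) + c*sin(S*x)."""
    a, b, c = coef
    cs, sn = math.cos(S * x), math.sin(S * x)
    return (a + b * cs + c * sn,
            S * (-b * sn + c * cs),
            -2.0 * (b * cs + c * sn))

def C_pair(ci, cj, x=0.4, p=1.0):
    """Left member of (2) at x; for solutions of (1) it does not depend on x."""
    yi, dyi, d2yi = jet(ci, x)
    yj, dyj, d2yj = jet(cj, x)
    return yi * d2yj + d2yi * yj - dyi * dyj + 2.0 * p * yi * yj

def D(ci, cj):
    """D_ij = C_ij^2 - C_i*C_j as in (5), with C_i = C_ii."""
    return C_pair(ci, cj) ** 2 - C_pair(ci, ci) * C_pair(cj, cj)

y1 = (0.0, 1.0, 0.0)         # cos(S*x)
singly = (0.0, 0.0, 1.0)     # sin(S*x): zeros interlace with those of y1
by_pairs = (0.5, 1.0, 0.0)   # 1/2 + cos(S*x): zeros separate those of y1 in pairs
coincide = (1.0, 2.0, -1.0)  # 1 + 2cos(S*x) - sin(S*x): vanishes where S*x = pi/2

for name, y2 in [("singly", singly), ("by pairs", by_pairs), ("coincide", coincide)]:
    print(name, D(y1, y2))
```

Each of the four solutions used has $C_i < 0$, so the hypothesis of Theorem III holds, and the computed values of $D_{12}$ are negative, positive and zero (up to rounding) in the three configurations.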


PURDUE UNIVERSITY.
















STRUCTURE AND CONTINUITY OF MEASURABLE FLOWS 
By WARREN AMBROSE AND SHIZUO KAKUTANI 


1. Introduction. The purpose of this paper is to establish some regularity
properties for flows which are assumed to satisfy only measurability conditions.
In particular, we are concerned with establishing conditions under which a flow
will be isomorphic to a continuous flow or a flow built under a function. For
measure spaces which satisfy the two conditions of being properly separable and
having a separating sequence of measurable sets (see Definitions 8 and 10),
our results can be summed up by saying that every flow different from the
identity is isomorphic to a (generalized) flow built under a function and to a
continuous flow on a separable metric space with a regular measure; these results
are obtained in Theorems 2, 4 and 5. For measure spaces which do not satisfy
these conditions, the situation is more complicated, and we refer the reader to
the body of the paper. Flows built under a function were first introduced, in a
special case, by J. von Neumann$^2$ and have since been considered by one of the
present authors.$^3$ The significance of this isomorphism of any flow to such a
flow is that it gives a kind of normal form for a flow and it makes possible the
taking of cross sections and the reduction of various properties of a flow to
properties of a single measure preserving transformation on such a cross section.


2. Definitions and notation. 

DEFINITION 1. A measure space $\Omega(\mathfrak{B}, m)$ is a system of a space $\Omega$, a Borel
field$^4$ $\mathfrak{B}$ of subsets $M$ of $\Omega$, and a countably additive measure$^5$ $m(M)$ defined on $\mathfrak{B}$
and satisfying the following conditions:

(2.1) $\Omega \in \mathfrak{B}$ and $m(\Omega) < +\infty$,
(2.2) there exists an $M \in \mathfrak{B}$ for which $0 < m(M) < m(\Omega)$,


Received June 13, 1941. 

1 All measures usually considered, and in particular Lebesgue measures in Euclidean
spaces, satisfy these conditions.

2 J. von Neumann [4], pp. 636-641.

3 W. Ambrose [1].

4 A collection of subsets of a space is called a field if it is closed under the operations of
finite addition, finite intersection and complementation. It is a Borel field if it is a field and
is closed under the operations of countable addition and intersection. It is easy to see
that for any collection of subsets of a space there exists a smallest field, and also a smallest
Borel field, which contains the given collection; these are called respectively the field
determined by the collection and the Borel field determined by the collection.

5 A countably additive measure is a non-negative set function $m(M)$ defined for all sets
$M$ in some Borel field $\mathfrak{B}$ and having the property that $m\left( \sum_{n=1}^{\infty} M_n \right) = \sum_{n=1}^{\infty} m(M_n)$ for any
sequence $\{M_n\}$ ($n = 1, 2, \dots$) of disjoint sets from $\mathfrak{B}$.










(2.3) $m(M)$ is completed,$^6$ i.e., if $M \in \mathfrak{B}$ and $m(M) = 0$, then every subset of $M$
is also in $\mathfrak{B}$.

Throughout the present paper we shall use the symbol $\Omega$, with various
appendages (e.g., $\Omega'$, $\bar\Omega$, $\Omega^*$) for a measure space. The corresponding Borel field
and measure, as well as points and sets in the space, will always be accompanied
by the same appendages. We shall use the symbols $A$, $M$ and $N$ for sets in a
measure space, and $\omega$ and $v$ for points in such a space. Sets in $\mathfrak{B}$ will be called
measurable and any set of measure zero will be called a null set. If the symmetric
difference$^7$ $M \ominus N$ of two sets $M$ and $N$ is a null set, then we shall say
that $M$ and $N$ are equivalent and write $M \sim N$. A real valued function $f(\omega)$
defined on a measure space is measurable if the set$^8$ $[\omega : f(\omega) > a]$ is measurable
for any real number $a$.

DEFINITION 2. A measure preserving transformation is a one-to-one mapping
$T$ of a measure space $\Omega(\mathfrak{B}, m)$ onto a measure space $\Omega'(\mathfrak{B}', m')$ with the property
that $M \in \mathfrak{B}$ if and only if $T(M) \in \mathfrak{B}'$, and further that $m(M) = m'(T(M))$ for
any measurable set $M \in \mathfrak{B}$. Usually $\Omega(\mathfrak{B}, m)$ and $\Omega'(\mathfrak{B}', m')$ will be the same
space. In this case, a set $M \in \mathfrak{B}$ is called invariant under $T$ if $\omega \in M$ implies
that both $T(\omega) \in M$ and $T^{-1}(\omega) \in M$. A measure preserving transformation $T$
(of a measure space onto itself) is ergodic if there exist no measurable sets
invariant under $T$ except null sets and complements of null sets.

DEFINITION 3. A flow is a one-parameter group $\{T_t\}$ ($-\infty < t < +\infty$,
$T_s(T_t(\omega)) = T_{s+t}(\omega)$ for all $s$, $t$, $\omega$) of measure preserving transformations $T_t$ of a
measure space onto itself. If $\{T_t\}$ is a flow on a measure space $\Omega$ and if $\omega$ is a
point of $\Omega$, then we shall usually denote $T_t(\omega)$ by $\omega_t$. For a fixed $\omega$, the set
of all $\omega_t$ ($-\infty < t < +\infty$) is called the trajectory through $\omega$. A set is called
invariant under a flow if it is invariant under each member of the flow, or
equivalently, if whenever it contains a point $\omega$ it contains the entire trajectory
through $\omega$. A flow is ergodic if there exist no invariant measurable sets except
null sets and complements of null sets.

DEFINITION 4. A flow $\{T_t\}$ on a measure space $\Omega$ is measurable if for any
measurable set $M$ in $\Omega$ the $(\omega, t)$-set $M^*$ defined by $M^* = [(\omega, t) : \omega_t \in M]$ is
measurable in the product space $\Omega^*$ of $\Omega$ with the real $t$-axis, where the measure
(and measurability) on $\Omega^*$ is defined multiplicatively in terms of the given
measure on $\Omega$ and the Lebesgue measure on the $t$-axis.$^9$


DEFINITION 5. Let $\{S_t\}$ and $\{T_t\}$ be two flows on measure spaces $\Omega'$ and
$\Omega''$ respectively. $\{S_t\}$ and $\{T_t\}$ are isomorphic if it is possible to split $\Omega'$ and

6 It is not always assumed that a measure has this property, but it is possible to extend
the domain of definition of any countably additive measure (i.e., to enlarge the Borel field
$\mathfrak{B}$ on which it is defined) to obtain a completed measure.

7 The symmetric difference of two sets $M$ and $N$ is defined by $M \ominus N = M + N - M \cdot N =
(M - MN) + (N - MN)$.

8 We use the symbol $[\omega; C]$ for the $\omega$-set on which the condition $C$ is satisfied.

9 For a definition of measure in product space, see S. Saks [5].







$\Omega''$ into two parts $N'$, $\Omega' - N'$ and $N''$, $\Omega'' - N''$ respectively, in such a way that

(2.4) $N'$ is a null set invariant under $\{S_t\}$ and $N''$ is a null set invariant
under $\{T_t\}$,

(2.5) there exists a measure preserving transformation of $\Omega' - N'$ onto
$\Omega'' - N''$ which carries $\{S_t\}$ into $\{T_t\}$, i.e., such that if $\omega' \in \Omega' - N'$ corresponds
to $\omega'' \in \Omega'' - N''$ then $S_t(\omega')$ corresponds to $T_t(\omega'')$ for all $t$.


DEFINITION 6. Let $\Omega(\mathfrak{B}, m)$ be a measure space, and let $S$ be a measure
preserving transformation of $\Omega$ onto itself. Let $f(\omega)$ be a positive real valued
function defined on $\Omega$ which is measurable and integrable on $\Omega$. We assume
that $\sum_{n=0}^{\infty} f(S^{n}(\omega)) = +\infty$ and $\sum_{n=1}^{\infty} f(S^{-n}(\omega)) = +\infty$ for any $\omega \in \Omega$. (This
condition is surely satisfied if there exists a constant $c > 0$ such that $f(\omega) > c$ for
all $\omega \in \Omega$.) Consider the product space of $\Omega$ with the real $u$-axis, with the
measure $\bar m$ defined on it multiplicatively in terms of $m$-measure on $\Omega$ and the
Lebesgue measure on the $u$-axis. Let $\bar\Omega$ be the portion of this product space$^{10}$
under the graph of $f(\omega)$, i.e., the set of all points $\bar\omega = (\omega, u)$ for which $0 \le u <
f(\omega)$, and let $\bar{\mathfrak{B}}$ be the collection of all $\bar m$-measurable subsets of $\bar\Omega$. Then
$\bar\Omega(\bar{\mathfrak{B}}, \bar m)$ is a measure space. Define the flow $\{T_t\}$ on $\bar\Omega$ by

(2.6) $T_t(\omega, u) = (\omega, u + t)$, if $-u \le t < -u + f(\omega)$,

$\qquad = (S^{n}(\omega),\ u + t - f(\omega) - \dots - f(S^{n-1}(\omega)))$,
$\qquad\quad$ if $-u + \sum_{k=0}^{n-1} f(S^{k}(\omega)) \le t < -u + \sum_{k=0}^{n} f(S^{k}(\omega))$, $n = 1, 2, \dots$,

$\qquad = (S^{-n}(\omega),\ u + t + f(S^{-1}(\omega)) + \dots + f(S^{-n}(\omega)))$,
$\qquad\quad$ if $-u - \sum_{k=1}^{n} f(S^{-k}(\omega)) \le t < -u - \sum_{k=1}^{n-1} f(S^{-k}(\omega))$, $n = 1, 2, \dots$.

We call $\{T_t\}$ the flow built under the function $f(\omega)$ on the measure preserving
transformation $S$, or simply, a flow built under a function. $\Omega$ is called the base
space, $S$ is a base transformation and $f(\omega)$ is called a ceiling function.
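
The case distinction in (2.6) amounts to moving the second coordinate at unit speed and applying $S$ (or $S^{-1}$) each time the ceiling (or the base) is crossed. The following is a minimal sketch of that rule on a toy example: the three-point base space, the cyclic shift `S`, the ceiling values `HEIGHTS`, and the function name `flow` are all assumptions made for illustration, and exact rational arithmetic is used so that the group property can be checked without rounding.

```python
from fractions import Fraction

N = 3                                                    # assumed base space {0, 1, 2}
HEIGHTS = [Fraction(1), Fraction(1, 2), Fraction(3, 2)]  # assumed ceiling values f(w)

def S(w):
    """Assumed base transformation: cyclic shift on {0, ..., N-1}."""
    return (w + 1) % N

def S_inv(w):
    """Inverse of the cyclic shift."""
    return (w - 1) % N

def flow(t, point):
    """T_t(w, u) following (2.6): unit speed upward, jump to S(w) at the ceiling."""
    w, u = point
    v = u + Fraction(t)
    while v >= HEIGHTS[w]:   # crossed the ceiling: subtract f(w), then apply S
        v -= HEIGHTS[w]
        w = S(w)
    while v < 0:             # went below the base: apply S^{-1}, then add f there
        w = S_inv(w)
        v += HEIGHTS[w]
    return (w, v)

start = (0, Fraction(1, 4))
print(flow(Fraction(5, 2), start))
print(flow(Fraction(-5, 2), flow(Fraction(5, 2), start)))
```

In this toy example the first call crosses two ceilings and lands over the point 2, and the second call returns to the starting point, as the group property $T_s(T_t(\bar\omega)) = T_{s+t}(\bar\omega)$ requires.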

This definition was given in [1]. In the following discussions, it is necessary
to consider a general case in which the base space $\Omega$ has an infinite
measure. In this case we call $\{T_t\}$ a generalized flow built under a function. It
is, however, to be noticed that we always assume that $\bar\Omega$ has a finite measure:
$\bar m(\bar\Omega) = \int_{\Omega} f(\omega)\,dm(\omega) < +\infty$. Hence it is impossible, in this case, that there
exist a constant $c > 0$ such that $f(\omega) > c$ for all $\omega \in \Omega$.$^{11}$


10 Since we assumed that $f(\omega)$ is integrable on $\Omega$, $\bar\Omega$ has also a finite measure: $\bar m(\bar\Omega) =
\int_{\Omega} f(\omega)\,dm(\omega) < \infty$.

11 It is assumed that $\Omega$ is a sum of a countable number of subsets of a finite measure.
More precisely, we shall consider only the following cases: Let $\{T_t^{(n)}\}$ ($n = 1, 2, \dots$) be a







As in [1] we shall have to consider the functions $F(\bar\omega)$ and $G(\bar\omega)$ associated
with a (generalized) flow built under a function, and defined by

(2.7) $F(\bar\omega) = F(\omega, u) = f(\omega),$
(2.8) $G(\bar\omega) = G(\omega, u) = u.$

These are both $\bar m$-measurable functions defined on $\bar\Omega$.


3. Properties of measurable flows. We begin with two lemmas concerning 
measurable flows. 

LEMMA 1. Let $\{T_t\}$ be a measurable flow on a measure space $\Omega(\mathfrak{B}, m)$. Then
(3.1) for any measurable function $g(\omega)$ defined on $\Omega$, there exists an invariant
null set $N \in \mathfrak{B}$ such that the $t$-function $g(\omega_t)$ is Lebesgue measurable for any
$\omega \in \Omega - N$;

(3.2) for any null set $M \in \mathfrak{B}$ there exists an invariant null set $N \in \mathfrak{B}$ such that
the $t$-set $[t : \omega_t \in M]$ is of Lebesgue measure zero for any $\omega \in \Omega - N$.

Proof. These are immediate consequences of the definition of a measurable
flow and Fubini's theorem.

LEMMA 2. Let $\{T_t\}$ be a measurable flow on a measure space $\Omega(\mathfrak{B}, m)$, and
let $M \in \mathfrak{B}$ be a measurable set such that $T_t(M) \sim M$ for every $t$. Then there exists
an invariant measurable set $N \in \mathfrak{B}$ such that $N \sim M$.

Proof. See E. Hopf [3], p. 27.

DEFINITION 7. A flow $\{T_t\}$ on a measure space $\Omega(\mathfrak{B}, m)$ is proper if every
measurable set of positive measure contains a measurable set $M \in \mathfrak{B}$ such that
$m((\Omega - M) \cdot T_{t_0}(M)) > 0$ for some $t_0$; it is completely improper if $M \sim T_t(M)$ for
any measurable set $M \in \mathfrak{B}$ and for any $t$. It is a consequence of Lemma 2 that
a flow is completely improper if and only if every measurable set is equivalent
to an invariant set.

THEOREM 1. Let $\{T_t\}$ be a measurable flow on a measure space $\Omega$. Then
$\Omega = \Omega_1 + \Omega_2$, where $\Omega_1$ and $\Omega_2$ are disjoint invariant measurable sets, and $\{T_t\}$
is completely improper on $\Omega_1$ and proper on $\Omega_2$.





sequence of flows built under a function, each defined on a measure space $\bar\Omega_n(\bar{\mathfrak{B}}, \bar m)$. We
denote the base space of $\bar\Omega_n$ by $\Omega_n(\mathfrak{B}, m)$. The base transformation and the ceiling function
on $\Omega_n$ are denoted by $S$ and $f(\omega)$ respectively (without suffix $n$). We assume that
$\sum_{n=1}^{\infty} \bar m(\bar\Omega_n) < +\infty$, but we do not assume that $\sum_{n=1}^{\infty} m(\Omega_n) < +\infty$. Let us put $\bar\Omega = \sum_{n=1}^{\infty} \bar\Omega_n$
and $\Omega = \sum_{n=1}^{\infty} \Omega_n$ (in each case, summands are assumed to be disjoint). $\bar\Omega(\bar{\mathfrak{B}}, \bar m)$ is a measure
space, while this is not always true for $\Omega(\mathfrak{B}, m)$ since $m(\Omega) = \sum_{n=1}^{\infty} m(\Omega_n)$ may be infinite.
Then, by putting together these flows $\{T_t^{(n)}\}$ ($n = 1, 2, \dots$), we shall have a flow $\{T_t\}$
defined on $\bar\Omega$; indeed, $\{T_t\}$ is defined by $T_t(\bar\omega) = T_t^{(n)}(\bar\omega)$ if $\bar\omega \in \bar\Omega_n$. $\{T_t\}$ is a generalized flow
built under the function $f(\omega)$ on the transformation $S$. This differs from the ordinary flow
built under a function only in that we do not assume the finiteness of $m(\Omega) = \sum_{n=1}^{\infty} m(\Omega_n)$.
n=l 







Proof. If $\{T_t\}$ is not proper on $\Omega$, then there exists a measurable set $M$ of
positive measure such that every measurable subset of $M$ is equivalent to an
invariant set. By Lemma 2, we may assume that $M$ itself is invariant. Let $\mathfrak{M}$
be the collection of all such sets $M$, and let $\alpha$ be the least upper bound of $m(M)$
for all $M \in \mathfrak{M}$. Then there exists a sequence $\{M_n\}$ ($n = 1, 2, \dots$) of measurable
sets from $\mathfrak{M}$ for which $m(M_n) \to \alpha$. Putting $\Omega_1 = \sum_{n=1}^{\infty} M_n$ and $\Omega_2 = \Omega - \Omega_1$,
it is clear that we have the required decomposition.

This theorem is useful because it allows us to consider separately the
completely improper flows and the proper flows, and to forget about the
intermediate case of flows which are neither proper nor completely improper. We
shall consider these two cases in §§4 and 5 respectively.


4. Structure of completely improper flows. Obviously, the identity flow
(defined by $\omega_t = \omega$ for all $\omega$ and $t$) is completely improper, and the question
arises whether there are any other completely improper flows. That there are
others is shown by the following example. Consider a flow built under some
function on the identity transformation (the particular function and the
particular space on which the identity transformation is defined do not matter);
but, instead of taking for our measurable sets the usual collection of all
$\bar m$-measurable sets, we take a subcollection of those $\bar m$-measurable sets which are
equivalent to a set depending on $\omega$ alone.$^{12}$ Then it is clear that this is a completely
improper flow, while there is no point which is invariant under the flow.

In this example of a completely improper flow, we first took a proper flow 
(it will be shown in Theorem 4 that every flow built under a function is proper) 
and then decreased the collection of measurable sets to obtain a completely 
improper flow. This fact leads to the suspicion that when a flow is improper 
it may not be through any fault of the flow itself but rather may be due to 
some deficiency in the collection of measurable sets of the measure space on 
which it is defined. Theorem 3 below confirms this suspicion to a certain ex- 
tent by determining the general form of completely improper flows, and Theorem 
2 will show that in all measure spaces usually considered the only completely 
improper measurable flow is the identity. 


DEFINITION 8. A countable collection $\{M_n\}$ ($n = 1, 2, \dots$) of subsets of $\Omega$
is called a separating sequence, if for every pair of points $\omega$ and $v$ of $\Omega$ ($\omega \ne v$)
there exists an $M_n$ which contains one but not both of $\omega$ and $v$.


THEOREM 2. Let $\{T_t\}$ be a completely improper measurable flow defined on a
measure space $\Omega(\mathfrak{B}, m)$. If $\Omega$ has a separating sequence of measurable sets, then
there exists an invariant null set $N \in \mathfrak{B}$ such that $\{T_t\}$ is the identity on $\Omega - N$.


12 A set $\bar M \subset \bar\Omega$ depends on $\omega$ alone if $(\omega, u) \in \bar M$ implies $(\omega, v) \in \bar M$ for all $v$ with $0 \le v
< f(\omega)$; such a set $\bar M$ is necessarily of the form $(M \times (-\infty, +\infty)) \cdot \bar\Omega$; we say that $\bar M$ is
determined by $M$.

Proof. Let $\{M_n\}$ ($n = 1, 2, \dots$) be a separating sequence of measurable
sets in $\Omega$. Since $\{T_t\}$ is completely improper, there exists a sequence of
invariant measurable sets $\{A_n\}$ ($n = 1, 2, \dots$) such that $A_n \sim M_n$ ($n = 1, 2, \dots$).
We define a null set $M$ by

$M = \sum_{n=1}^{\infty} (M_n \ominus A_n) = \sum_{n=1}^{\infty} (M_n + A_n - M_n \cdot A_n).$

Then, by Lemma 1, there exists an invariant null set $N \in \mathfrak{B}$ such that for
$\omega \in \Omega - N$ the $t$-set $M(\omega) = [t : \omega_t \in M]$ is of Lebesgue measure zero. We shall
prove that this $N$ is a required set, i.e., that $\omega_t = \omega$ for any $\omega \in \Omega - N$ and for
any $t$. First we prove that, for $\omega \in \Omega - N$, $t_1 \notin M(\omega)$ and $t_2 \notin M(\omega)$ imply
$\omega_{t_1} = \omega_{t_2}$. Indeed, if we have $\omega_{t_1} \ne \omega_{t_2}$ for such $t_1$ and $t_2$, then there exists
an $M_n$ which contains one but not both of $\omega_{t_1}$ and $\omega_{t_2}$. Since $\omega_{t_1}$ and $\omega_{t_2}$ both
do not belong to $M_n \ominus A_n$, the same thing must be true of $A_n$. This is, however,
a contradiction to the fact that $A_n$ is invariant under $\{T_t\}$. Thus we have
proved that, for every $\omega \in \Omega - N$, $t_1 \notin M(\omega)$ and $t_2 \notin M(\omega)$ imply $\omega_{t_1} = \omega_{t_2}$.
The remainder of the proof of Theorem 2 clearly follows from the following
lemma.


LEMMA 3. Let $\{T_t\}$ be a flow on a measure space. If, for every fixed $\omega \in \Omega$,
the $t$-set $E$ defined by $E = [t : \omega_t = \omega]$ is Lebesgue measurable and of positive
measure, then $\omega_t = \omega$ for all $t$.


Proof. The group property of the flow clearly implies that $E$ is an additive
group of real numbers. The lemma then follows from the known theorem that
the only additive group of real numbers which is Lebesgue measurable and of
positive measure is the whole real line.

DEFINITION 9. Consider a measure space $\Omega(\mathfrak{B}, m)$ and a real valued non-negative function $f(\omega)$ defined on it. We do not assume that $f(\omega)$ is measurable. Moreover, $f(\omega)$ may be equal to $0$ or $+\infty$. For every point $\omega \in \Omega$ consider the set $\bar\Omega(\omega)$ of real numbers defined as follows: if $f(\omega) = 0$, then $\bar\Omega(\omega)$ consists only of a single point $0$; if $0 < f(\omega) < +\infty$, then $\bar\Omega(\omega)$ is a semi-open interval: $0 \le u < f(\omega)$; if $f(\omega) = +\infty$, then $\bar\Omega(\omega)$ is an infinite interval: $-\infty < u < +\infty$. Let $\bar\Omega$ be the set of all pairs $(\omega, u)$, where $\omega \in \Omega$ and $u \in \bar\Omega(\omega)$.


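We shall make $\bar\Omega$ into a measure space. Let $\bar{\mathfrak{N}}_0$ be the collection of all sets $\bar N \subset \bar\Omega$ for which there exists a null set $N \in \mathfrak{B}$ $(N \subset \Omega)$ such that $\omega \in \Omega - N$ implies $m_0(\bar N(\omega)) = 0$, where $\bar N(\omega)$ is the set of all $u \in \bar\Omega(\omega)$ such that $\bar\omega = (\omega, u) \in \bar N$ and $m_0$ is a measure defined on $\bar\Omega(\omega)$ in the following way: if $f(\omega) = 0$, then $m_0(\{0\}) = 1$ ($\{0\}$ is a set consisting of $0$ alone); if $0 < f(\omega) \le +\infty$, then $m_0$ is the ordinary Lebesgue measure. Clearly $\bar{\mathfrak{N}}_0$ has the following properties:

(4.1) $\bar N_n \in \bar{\mathfrak{N}}_0$ $(n = 1, 2, \dots)$ imply $\sum_{n=1}^{\infty} \bar N_n \in \bar{\mathfrak{N}}_0$,

(4.2) $\bar N_0 \in \bar{\mathfrak{N}}_0$, $\bar N \subset \bar N_0$ imply $\bar N \in \bar{\mathfrak{N}}_0$.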


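Now let $\bar{\mathfrak{N}}$ be an arbitrary subcollection of $\bar{\mathfrak{N}}_0$ which has the same properties (4.1) and (4.2). Let $\bar{\mathfrak{B}}_0$ be the collection of all sets $\bar M \subset \bar\Omega$ which depend on $\omega$ alone and are determined by a set $M \in \mathfrak{B}$ $(M \subset \Omega)$. Then the collection $\bar{\mathfrak{B}}$ is defined as follows: a set $\bar A \subset \bar\Omega$ belongs to $\bar{\mathfrak{B}}$ if and only if there exists a set $\bar M \in \bar{\mathfrak{B}}_0$ such that the symmetric difference $\bar A \ominus \bar M$ belongs to $\bar{\mathfrak{N}}$. $\bar{\mathfrak{B}}$ is clearly a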



















STRUCTURE AND CONTINUITY OF MEASURABLE FLOWS 31 


Borel field and if we put $\bar m(\bar A) = m(M)$, where $M \in \mathfrak{B}$ is the set in $\Omega$ which determined $\bar M \in \bar{\mathfrak{B}}_0$, then $\bar m(\bar A)$ is a countably additive measure which is defined and completed on $\bar{\mathfrak{B}}$. We define a flow $\{T_t\}$ on the measure space $\bar\Omega(\bar{\mathfrak{B}}, \bar m)$ as follows: if $f(\omega) = 0$, then $T_t(\omega, 0) = (\omega, 0)$ for all $t$ (i.e., $\{T_t\}$ is the identity); if $0 < f(\omega) < +\infty$, then $T_t(\omega, u) = (\omega, u + t)$, where the second coordinate is to be taken modulo $f(\omega)$ (i.e., $\{T_t\}$ is isomorphic to a rotation of a circumference); if $f(\omega) = +\infty$, then $T_t(\omega, u) = (\omega, u + t)$ for all $t$ (i.e., $\{T_t\}$ is isomorphic to a translation on an infinite line). It is clear that the flow $\{T_t\}$ thus defined is a completely improper flow. Such a flow is called a singular flow.
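
As a quick check that the three cases above do define a flow, the group property can be verified directly in the periodic case. The bracket notation $[x]_f$ for reduction modulo $f$ is our own shorthand, not the authors'.

```latex
% Case 0 < f(\omega) < +\infty. Notation (ours): [x]_f is the representative
% of x modulo f lying in [0, f).
\begin{align*}
T_s\bigl(T_t(\omega, u)\bigr)
  &= T_s\bigl(\omega, [u + t]_{f(\omega)}\bigr)
   = \bigl(\omega, \bigl[[u + t]_{f(\omega)} + s\bigr]_{f(\omega)}\bigr) \\
  &= \bigl(\omega, [u + t + s]_{f(\omega)}\bigr)
   = T_{s+t}(\omega, u), \qquad T_0(\omega, u) = (\omega, u).
\end{align*}
% The cases f(\omega) = 0 (identity) and f(\omega) = +\infty (translation)
% are immediate. Each T_t leaves the first coordinate fixed, so every set
% that depends on \omega alone is invariant under the flow.
```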


THEOREM 3. Every completely improper measurable flow is isomorphic to a 
singular flow. 

Proof. Let $\{S_t\}$ be a completely improper measurable flow defined on a measure space $\Omega^*(\mathfrak{B}^*, m^*)$. From each trajectory of $\{S_t\}$ pick a single point $\omega^*$, and let $\Omega$ be the set of all such points. When we consider $\omega^*$ as a point of $\Omega$, we denote it by $\omega$. We shall define a measure on $\Omega$. A collection $\mathfrak{B}$ of measurable sets in $\Omega$ is defined as follows: a subset $M$ of $\Omega$ belongs to $\mathfrak{B}$ if and only if the corresponding set $M^* = \sum_{\omega \in M} \Omega^*(\omega)$ belongs to $\mathfrak{B}^*$, where we denote by $\Omega^*(\omega)$ the trajectory through a point $\omega^*$ which corresponds to $\omega$. $\mathfrak{B}$ is clearly a Borel field, and if we put $m(M) = m^*(M^*)$ for every $M \in \mathfrak{B}$, then $m(M)$ is a countably additive measure which is defined and completed on $\mathfrak{B}$. Thus we have defined a measure space $\Omega(\mathfrak{B}, m)$.

Now we shall define a function $f(\omega)$ on $\Omega$ as follows: if $\Omega^*(\omega)$ consists of $\omega^*$ alone (i.e., if $\omega^*$ is invariant under $\{S_t\}$), then $f(\omega) = 0$; if $\{S_t\}$ is periodic on $\Omega^*(\omega)$, then $f(\omega)$ = period of $\{S_t\}$ on $\Omega^*(\omega)$; if $\{S_t\}$ has no period on $\Omega^*(\omega)$ (i.e., if $S_{t_1}(\omega^*) \ne S_{t_2}(\omega^*)$ for any $t_1$ and $t_2$ $(t_1 \ne t_2)$), then $f(\omega) = +\infty$. We construct the measure space $\bar\Omega(\bar{\mathfrak{B}}, \bar m)$ in terms of the measure space $\Omega(\mathfrak{B}, m)$ and the function $f(\omega)$ thus defined as in Definition 9. (Here it must be noticed that there was a certain arbitrariness in the choice of a subcollection $\bar{\mathfrak{N}}$ of $\bar{\mathfrak{N}}_0$.) We shall now show that, if we take a suitable subcollection $\bar{\mathfrak{N}}$ of $\bar{\mathfrak{N}}_0$, then the singular flow $\{T_t\}$ thus obtained is isomorphic to the given flow $\{S_t\}$. In order to prove this, consider the correspondence $S_t(\omega^*) \leftrightarrow (\omega, t)$, where $\omega^* \in \Omega^*$ is a point chosen from each trajectory which corresponds to a point $\omega \in \Omega$. It is clear that the flows $\{S_t\}$ and $\{T_t\}$ are carried over into each other by this correspondence, and it only remains to show that the Borel fields $\mathfrak{B}^*$ and $\bar{\mathfrak{B}}$ are also carried over into each other by this correspondence. This is, however, a consequence of the following two facts: (1) by the definition of a completely improper flow, every measurable set $A^* \in \mathfrak{B}^*$ is equivalent to an invariant set $M^* \in \mathfrak{B}^*$ and the latter corresponds to a set $\bar M \in \bar{\mathfrak{B}}_0$ which depends on $\omega$ alone; and (2) by Lemma 1, every set $N^* \in \mathfrak{B}^*$ of measure zero corresponds to a set $\bar N \in \bar{\mathfrak{N}}_0$, where $\bar{\mathfrak{N}}_0$ is the collection defined in Definition 9, and the collection of all such sets $\bar N$ (which correspond to some null set $N^* \in \mathfrak{B}^*$) has the properties (4.1) and (4.2). Hence we have only to take $\bar{\mathfrak{N}}$ as the collection of all such sets $\bar N$. This proves Theorem 3.










5. Structure of proper flows; the fundamental representation theorem. In 
this section we prove that a measurable flow is isomorphic to a generalized flow 
built under a function if and only if it is proper. This theorem was proved 
previously for ergodic flows by one of us [1]. In order to prove it for the non-ergodic case, we need Lemma 4 below, which is essentially due to Poincaré.$^{13}$
We remark that the proof for the ergodic case was obtained by using a certain 
recurrence property that follows from ergodicity, whereas this lemma gives us a 
usable recurrence without any such assumption as ergodicity. 


LEMMA 4. Let $\{T_t\}$ be a flow on a measure space $\Omega$, and let $A$ be any measurable set in $\Omega$. Then there exists a subset $N$ of $\Omega$ of measure zero such that for any $\omega \in A - N$ the trajectory through $\omega$ intersects $A$ infinitely often both as $t \to +\infty$ and as $t \to -\infty$.

Proof. This lemma follows trivially by applying Hilfssatz 13.3, p. 48 of E. Hopf [3], to any member, other than $T_0$, of the flow $\{T_t\}$.


THEOREM 4. A measurable flow is isomorphic to a generalized flow built under a function if and only if it is proper.$^{14}$


Proof. First we prove that a generalized flow built under a function (and 
hence any flow isomorphic to a generalized flow built under a function) is proper. 
Let {7,} be a generalized flow built under a function (we use the notation of 
Definition 6) and let A be any measurable set of positive measure. Then choose 
real numbers a and b with 0 < b — a < asuch that the set M defined by 

$$M = [\bar\omega : a < G(\bar\omega) < b] \cdot A$$
has a positive measure (that this can be done is a consequence of the fact that 
the measure 7 on the (w, u)-space is defined multiplicatively in terms of the 
measure m of @ and the Lebesgue measure on the u-axis). Then M is a meas- 
urable subset of A with the property that m((@ — M)-T_.(M)) > 0. This 
proves that {7} is proper. 

Now we prove that, conversely, every proper measurable flow is isomorphic to a generalized flow built under a function. Let $\{S_t\}$ be a proper measurable flow on a measure space $\Omega^*(\mathfrak{B}^*, m^*)$. We shall find a decomposition of $\Omega^*$ into a sum of mutually disjoint invariant measurable sets $\{\Omega_n^*\}$ $(n = 1, 2, \dots)$ on each of which the flow $\{S_t\}$ is isomorphic to a flow built under a function. The number of these parts may be either finite or countably infinite, and there may also be a remaining part of measure zero. Putting together these flows built under a function, we finally have a generalized flow built under a function which is isomorphic to $\{S_t\}$.

Consider the collection $\mathfrak{M}^*$ of all measurable sets $M^* \in \mathfrak{B}^*$ of the form: $M^* = (\Omega^* - A^*) \cdot S_t(A^*)$, where $A^* \in \mathfrak{B}^*$ and $t$ is any real number. Let $\delta_1 = \delta(\Omega^*)$ be the upper bound of the measures $m^*(M^*)$ for all $M^* \in \mathfrak{M}^*$. Since $\{S_t\}$ is proper, $\delta_1 = \delta(\Omega^*)$ must be positive. We shall now show that there exists an


13 See E. Hopf [3], p. 48. 
14 Theorem 4 includes Theorem 2 of [4] as a special case since, as is easily seen, every ergodic flow is proper.







invariant measurable set $\Omega_1^* \subset \Omega^*$ such that $m^*(\Omega_1^*) > \tfrac14\delta_1$ and on which the flow $\{S_t\}$ is isomorphic to a flow built under a function.

For this purpose, let $M^* \in \mathfrak{M}^*$ be a measurable set with $m^*(M^*) > \tfrac12\delta_1$, and let $A^*$ be a measurable set such that $M^* = (\Omega^* - A^*) \cdot S_{t_0}(A^*)$, where $t_0$ is a suitable real number. We shall denote the characteristic function of $A^*$ by $\varphi(\omega^*)$. Then, by Lemma 1, there exists an invariant set $N^*$ of measure zero such that for any $\omega^* \in \Omega^* - N^*$ the $t$-function $\varphi(\omega_t^*)$ is Lebesgue measurable. For any $\omega^* \in \Omega^* - N^*$ we define a function $\Phi_c(\omega^*)$ by

(5.1) $\Phi_c(\omega^*) = \dfrac{1}{c} \displaystyle\int_0^c \varphi(\omega_t^*)\,dt, \qquad 0 < c \le 1,$

and we define the sets $A_1^*$ and $A_2^*$ by

(5.2) $A_1^* = [\omega^* : \Phi_c(\omega^*) < \tfrac14], \qquad A_2^* = [\omega^* : \Phi_c(\omega^*) > \tfrac34].$

(We shall choose a fixed value of $c$ in a moment.) By a theorem of N. Wiener,$^{15}$ we know that $\Phi_c(\omega^*) \to \varphi(\omega^*)$ almost everywhere as $c \to 0$. Hence we can choose a real number $c$ with $0 < c \le 1$ in such a way that $m^*(A_1^* \ominus (\Omega^* - A^*)) < \tfrac14 m^*(M^*)$ and $m^*(A_2^* \ominus A^*) < \tfrac14 m^*(M^*)$. From these it follows easily that $m^*(A_1^* \cdot M^*) > \tfrac34 m^*(M^*)$ and $m^*(S_{t_0}(A_2^*) \cdot M^*) > \tfrac34 m^*(M^*)$, and hence that $m^*(A_1^* \cdot S_{t_0}(A_2^*)) > \tfrac12 m^*(M^*)$. We now define $\Omega_1^*$ to be the set of all points $\omega^* \in \Omega^*$ whose trajectories intersect $A_1^* \cdot S_{t_0}(A_2^*)$ infinitely often both as $t \to +\infty$ and as $t \to -\infty$. Obviously $\Omega_1^*$ is an invariant set, and it is formally given by

(5.3) $\Omega_1^* = \Bigl(\prod_s \sum_{t > s} S_t\bigl(A_1^* \cdot S_{t_0}(A_2^*)\bigr)\Bigr) \cdot \Bigl(\prod_s \sum_{t < s} S_t\bigl(A_1^* \cdot S_{t_0}(A_2^*)\bigr)\Bigr),$

where $s$ and $t$ in these summations and products run through all real numbers. Using the fact that $\Phi_c(\omega_t^*)$ is a continuous $t$-function for each $\omega^* \in \Omega^* - N^*$, it is easy to see that the expression on the right hand side of (5.3) is the same whether $s$ and $t$ run through all real numbers or only through rational numbers. Hence $\Omega_1^*$ is $m^*$-measurable. It is clear that $m^*(\Omega_1^*) \ge m^*(A_1^* \cdot S_{t_0}(A_2^*)) > \tfrac12 m^*(M^*) > \tfrac14\delta_1$.
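
The step "from these it follows easily" is a short measure count. The sketch below assumes the constants read $\tfrac14$ in the two approximation bounds, and uses only that $M^* \subset \Omega^* - A^*$, that $M^* \subset S_{t_0}(A^*)$, and that $S_{t_0}$ preserves measure.

```latex
% Assumed constants: both symmetric differences have measure < (1/4) m^*(M^*).
\begin{align*}
m^*(M^* - A_1^*)
  &\le m^*\bigl((\Omega^* - A^*) - A_1^*\bigr) < \tfrac14\, m^*(M^*)
  &&\Rightarrow\; m^*(A_1^* \cdot M^*) > \tfrac34\, m^*(M^*), \\
m^*\bigl(M^* - S_{t_0}(A_2^*)\bigr)
  &\le m^*\bigl(S_{t_0}(A^* - A_2^*)\bigr) = m^*(A^* - A_2^*) < \tfrac14\, m^*(M^*)
  &&\Rightarrow\; m^*\bigl(S_{t_0}(A_2^*) \cdot M^*\bigr) > \tfrac34\, m^*(M^*), \\
m^*\bigl(A_1^* \cdot S_{t_0}(A_2^*)\bigr)
  &\ge m^*(M^*) - m^*(M^* - A_1^*) - m^*\bigl(M^* - S_{t_0}(A_2^*)\bigr)
  &&>\; \tfrac12\, m^*(M^*).
\end{align*}
% Combined with m^*(M^*) > (1/2)\delta_1 this gives the lower bound (1/4)\delta_1.
```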

Next we shall prove that, on this part $\Omega_1^*$, the flow $\{S_t\}$ is isomorphic to a flow built under a function. To do this we must find a measure space $\Omega_1$, a measure preserving transformation $S$ on $\Omega_1$, and a positive valued function $f(\omega)$ defined on $\Omega_1$, such that the flow built under a function in terms of these is isomorphic to the given flow $\{S_t\}$ on $\Omega_1^*$. First we notice that for every $\omega^* \in \Omega_1^*$ we have (from the definition (5.1) of $\Phi_c(\omega^*)$) that

(5.4) $|\Phi_c(\omega_t^*) - \Phi_c(\omega_s^*)| \le \dfrac{2}{c}\,|t - s|.$
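
Inequality (5.4) comes straight from definition (5.1). The sketch below assumes the constant in (5.4) is $2/c$, which is what the later estimate $2m\,|t - s|$ (for $c = 1/m$) suggests.

```latex
% Take t > s; \varphi is a characteristic function, so 0 \le \varphi \le 1.
\begin{align*}
\Phi_c(\omega_t^*) - \Phi_c(\omega_s^*)
  &= \frac1c \int_t^{t+c} \varphi(\omega_r^*)\,dr
   - \frac1c \int_s^{s+c} \varphi(\omega_r^*)\,dr \\
  &= \frac1c \int_{s+c}^{t+c} \varphi(\omega_r^*)\,dr
   - \frac1c \int_s^{t} \varphi(\omega_r^*)\,dr .
\end{align*}
% Each of the two remaining integrals lies in [0, t - s], hence
\[
  \bigl| \Phi_c(\omega_t^*) - \Phi_c(\omega_s^*) \bigr|
  \le \frac{1}{c}\,|t - s| \le \frac{2}{c}\,|t - s| .
\]
% In particular \Phi_c(\omega_t^*) is a continuous (indeed Lipschitz) t-function.
```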


Now we define 9; by 


(5.5) &%=aF E (w*)=3 and (w,)>3 foralltin O<ts s|- 


15 N. Wiener [6], p. 2, Theorem III'.










Since the trajectory of each $\omega^* \in \Omega_1^*$ intersects $A_1^* \cdot S_{t_0}(A_2^*)$ infinitely often both as $t \to +\infty$ and as $t \to -\infty$, it will have the same property for each of the sets $A_1^*$ and $A_2^*$. Then the fact that $\Phi_c(\omega^*) < \tfrac14$ for $\omega^* \in A_1^*$ while $\Phi_c(\omega^*) > \tfrac34$ for $\omega^* \in A_2^*$, together with the continuity of $\Phi_c(\omega_t^*)$ along each trajectory (as a $t$-function) and the inequality (5.4) imply, as in [1], that each trajectory through a point of $\Omega_1^*$ intersects $\Omega_1$ infinitely often both as $t \to +\infty$ and as $t \to -\infty$. Having established this, we can now define $S$ and $f(\omega)$ as follows: for any $\omega \in \Omega_1$, there is a smallest positive number $t$ for which $S_t(\omega) \in \Omega_1$. We define $S(\omega)$ to be $S_t(\omega)$ and $f(\omega)$ to be $t$. (We notice that $f(\omega)$ is bounded below by a positive constant.)

Now we need to define a measure on $\Omega_1$ for which $S$ is a measure preserving transformation and $f(\omega)$ is a measurable function, and then to show that the flow built under a function $\{T_t\}$ which we obtain from these (as in Definition 6) is isomorphic to the given flow $\{S_t\}$ on $\Omega_1^*$. According to Definition 6, $\{T_t\}$ is defined on a measure space $\bar\Omega_1(\bar{\mathfrak{B}}, \bar m)$ whose points are of the form $\bar\omega = (\omega, u)$, $0 \le u < f(\omega)$, $\omega \in \Omega_1$. It is clear that the mapping $(\omega, u) \leftrightarrow S_u(\omega)$ gives a one-to-one correspondence between $\bar\Omega_1$ and $\Omega_1^*$, and that this correspondence carries $\{T_t\}$ into $\{S_t\}$. Now let $\bar m(\bar M)$ be the measure on $\bar\Omega_1$ which is carried over from the $m^*$-measure on $\Omega_1^*$ by this correspondence. Then all we need to show is that this measure $\bar m$ is the product measure of a certain measure on $\Omega_1$ with the Lebesgue measure on the $u$-axis, where the measure on $\Omega_1$ is such that $S$ is a measure preserving transformation and $f(\omega)$ is a measurable function. To prove this it is sufficient, by Theorem 1 of [1], to show that the functions $F(\bar\omega)$ and $G(\bar\omega)$ defined by (2.6) are $\bar m$-measurable. Since this can be proved exactly in the same way as in the case of ergodic flows, we omit the details and refer the reader to [1], pp. 734-735.

We have thus proved the existence of an invariant measurable subset $\Omega_1^*$ of $\Omega^*$, whose measure satisfies $m^*(\Omega_1^*) > \tfrac14\delta_1 = \tfrac14\delta(\Omega^*)$ and on which the flow $\{S_t\}$ is isomorphic to a flow built under a function.$^{16}$ If $\Omega_1^* = \Omega^*$, then our proof is complete; if $m^*(\Omega^* - \Omega_1^*) > 0$, then we consider the flow $\{S_t\}$ on the remaining invariant set $\Omega^* - \Omega_1^*$, on which it is again a proper measurable flow. Then, by the same argument as above, we can find an invariant measurable subset $\Omega_2^*$ of $\Omega^* - \Omega_1^*$ such that $m^*(\Omega_2^*) > \tfrac14\delta_2$ and on which the flow $\{S_t\}$ is isomorphic to a flow built under a function, where $\delta_2 = \delta(\Omega^* - \Omega_1^*)$ is defined for $\Omega^* - \Omega_1^*$ exactly as $\delta_1 = \delta(\Omega^*)$ was defined for $\Omega^*$. Repeating this procedure, if necessary, we obtain at the $n$-th stage $n$ invariant measurable subsets $\Omega_1^*, \dots, \Omega_n^*$ of $\Omega^*$, which are mutually disjoint and on each of which the flow $\{S_t\}$ is isomorphic to a flow built under a function. Moreover, the condition $m^*(\Omega_k^*) > \tfrac14\delta_k = \tfrac14\delta(\Omega^* - \Omega_1^* - \dots - \Omega_{k-1}^*)$ is satisfied for $k = 1, 2, \dots, n$. If $m^*(\Omega^* - \Omega_1^* - \dots - \Omega_n^*) = 0$ for some $n$, then our proof is complete; if, on the contrary, $m^*(\Omega^* - \Omega_1^* - \dots - \Omega_n^*) > 0$ for each $n$, then we have a sequence of invariant measurable subsets $\{\Omega_n^*\}$ $(n = 1, 2, \dots)$ of $\Omega^*$, which are mutually disjoint


16 By appealing to transfinite induction we could eliminate the remainder of the proof. 













and on each of which the flow $\{S_t\}$ is isomorphic to a flow built under a function. Moreover, the condition $m^*(\Omega_n^*) > \tfrac14\delta_n = \tfrac14\delta(\Omega^* - \Omega_1^* - \dots - \Omega_{n-1}^*)$ is satisfied for $n = 1, 2, \dots$. We shall now prove that $m^*(\Omega^* - \sum_{n=1}^{\infty} \Omega_n^*) = 0$. Indeed, if $\Omega_\omega^* = \Omega^* - \sum_{n=1}^{\infty} \Omega_n^*$ is of positive measure, then the flow $\{S_t\}$ is again proper on $\Omega_\omega^*$ and the corresponding real number $\delta_\omega = \delta(\Omega_\omega^*)$ (defined for $\Omega_\omega^*$ exactly as $\delta_1 = \delta(\Omega^*)$ was defined for $\Omega^*$) must be positive. This is, however, a contradiction since from $\Omega_\omega^* \subset \Omega^* - \Omega_1^* - \dots - \Omega_{n-1}^*$ follows $\delta_\omega \le \delta_n$ for $n = 1, 2, \dots$, and hence

$$n\,\delta_\omega \le \sum_{k=1}^{n} \delta_k < 4 \sum_{k=1}^{n} m^*(\Omega_k^*) \le 4\,m^*(\Omega^*),$$

so that $\delta_\omega$ must be zero. Consequently we have $m^*(\Omega_\omega^*) = 0$, and this completes the proof of Theorem 4.

Remark. It is not always true that a proper measurable flow is isomorphic to a flow built under a function. Consider, for example, a generalized flow built under a function $\{T_t\}$ defined as follows: the base measure space $\Omega$ is the set of all positive integers $(n = 1, 2, \dots)$ on which a measure $m$ is defined for every subset and is equal to the number of points (hence $m(\Omega) = +\infty$); the base transformation $S$ is the identity; and the ceiling function is defined by $f(n) = 2^{-n}$. It is clear that $\{T_t\}$ is not isomorphic to a flow built under a function.


6. Isomorphism to continuous flows. In this section we prove that every 
measurable flow on a certain kind of measure space—and this class includes 
most of the measure spaces usually considered—is isomorphic to a continuous 
flow on a separable metric space, on which the measure and topology are nicely 
related. When a measurable flow is given, it is trivial to find a metric with respect to which the flow is continuous, but this metric may not be related to the measure in any way.$^{17}$ The point of this isomorphism theorem is to show that it can be done with the measure and the topology properly related.

We begin with some definitions and lemmas. 


DEFINITION 10. A measure space $\Omega(\mathfrak{B}, m)$ is properly separable,$^{18}$ if there exists a countable collection $\mathfrak{A}$ of measurable sets such that the Borel field determined by $\mathfrak{A}$, when completed with respect to the measure $m$, is exactly the collection $\mathfrak{B}$ of all measurable sets in $\Omega$. $\mathfrak{A}$ is called a basis of $\Omega(\mathfrak{B}, m)$.

A countable collection $\mathfrak{A}$ of measurable sets is clearly a basis of a measure space $\Omega(\mathfrak{B}, m)$ if it satisfies the following condition:

(6.1) For any measurable set $M \in \mathfrak{B}$ and for any $\varepsilon > 0$, there exists a sequence of measurable sets $\{M_n\}$ $(M_n \in \mathfrak{A},\ n = 1, 2, \dots)$ such that $M \subset \sum_{n=1}^{\infty} M_n$ and $m\bigl(\sum_{n=1}^{\infty} M_n\bigr) < m(M) + \varepsilon$.

Moreover, a necessary and sufficient condition that


17 For example, we might define a distance by $d(\omega, v) = 1$ if $\omega$ and $v$ are on different trajectories, and $d(\omega, v) = \min(1, |t|)$ if $\omega$ and $v$ are on the same trajectory and $\omega_t = v$.

18 The conditions of proper separability and the existence of a separating sequence are 
logically independent. 













a measure space $\Omega(\mathfrak{B}, m)$ be properly separable is that there exist a countable basis which satisfies the following stronger condition:

(6.2) For any measurable set $M \in \mathfrak{B}$ and for any $\varepsilon > 0$, there exists a sequence of measurable sets $\{M_n\}$ $(M_n \in \mathfrak{A},\ n = 1, 2, \dots)$ such that $M \subset \sum_{n=1}^{\infty} M_n$ and $\sum_{n=1}^{\infty} m(M_n) < m(M) + \varepsilon$.

If $\mathfrak{A}$ is a basis of a measure space $\Omega(\mathfrak{B}, m)$, then the field determined by $\mathfrak{A}$ clearly satisfies the condition (6.2).

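LEMMA 5. Let $\Omega^*(\mathfrak{B}^*, m^*)$ be the product measure space of two measure spaces $\Omega(\mathfrak{B}, m)$ and $\Omega'(\mathfrak{B}', m')$. Then $\Omega^*(\mathfrak{B}^*, m^*)$ is properly separable if and only if both $\Omega(\mathfrak{B}, m)$ and $\Omega'(\mathfrak{B}', m')$ are properly separable.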

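Proof. We shall first prove that if $\Omega(\mathfrak{B}, m)$ and $\Omega'(\mathfrak{B}', m')$ are both properly separable, then so is $\Omega^*(\mathfrak{B}^*, m^*)$. Let $\mathfrak{A}$ and $\mathfrak{A}'$ be the countable bases of $\Omega(\mathfrak{B}, m)$ and $\Omega'(\mathfrak{B}', m')$ respectively which have the property (6.2). Let us consider the collection $\mathfrak{A}^*$ of all sets $M^*$ of the form: $M^* = M \times M'$, $M \in \mathfrak{A}$, $M' \in \mathfrak{A}'$. Then $\mathfrak{A}^*$ is also countable, and all we have to prove is that $\mathfrak{A}^*$ has the property (6.2). In order to prove this, let $M^* \in \mathfrak{B}^*$ and let $\varepsilon > 0$ be an arbitrary positive number. By the definition of the product measure space, there exist two sequences of measurable sets $\{M_n\}$ $(M_n \in \mathfrak{B},\ n = 1, 2, \dots)$ and $\{M_n'\}$ $(M_n' \in \mathfrak{B}',\ n = 1, 2, \dots)$ such that $M^* \subset \sum_{n=1}^{\infty} M_n \times M_n'$ and $\sum_{n=1}^{\infty} m(M_n)\,m'(M_n') < m^*(M^*) + \dfrac{\varepsilon}{2}$. Since $\mathfrak{A}$ and $\mathfrak{A}'$ both have the property (6.2), there exist, for each $n$, two sequences of measurable sets $\{M_{n,k}\}$ $(M_{n,k} \in \mathfrak{A},\ k = 1, 2, \dots)$ and $\{M_{n,k}'\}$ $(M_{n,k}' \in \mathfrak{A}',\ k = 1, 2, \dots)$ such that

$$M_n \subset \sum_{k=1}^{\infty} M_{n,k}, \qquad \sum_{k=1}^{\infty} m(M_{n,k}) < m(M_n) + \varepsilon_n,$$

$$M_n' \subset \sum_{k=1}^{\infty} M_{n,k}', \qquad \sum_{k=1}^{\infty} m'(M_{n,k}') < m'(M_n') + \varepsilon_n,$$

where $\varepsilon_n$ is taken so small that we have $\varepsilon_n m(M_n) + \varepsilon_n m'(M_n') + \varepsilon_n^2 < \dfrac{\varepsilon}{2^{n+1}}$. Consequently we have

$$M^* \subset \sum_{n=1}^{\infty} M_n \times M_n' \subset \sum_{n=1}^{\infty} \sum_{k=1}^{\infty} \sum_{l=1}^{\infty} M_{n,k} \times M_{n,l}',$$

$$\sum_{n=1}^{\infty} \sum_{k=1}^{\infty} \sum_{l=1}^{\infty} m^*(M_{n,k} \times M_{n,l}') = \sum_{n=1}^{\infty} \sum_{k=1}^{\infty} \sum_{l=1}^{\infty} m(M_{n,k})\,m'(M_{n,l}') < \sum_{n=1}^{\infty} \bigl(m(M_n) + \varepsilon_n\bigr)\bigl(m'(M_n') + \varepsilon_n\bigr)$$

$$< \sum_{n=1}^{\infty} \Bigl(m(M_n)\,m'(M_n') + \frac{\varepsilon}{2^{n+1}}\Bigr) < m^*(M^*) + \varepsilon,$$

which proves that the property (6.2) is true for the collection $\mathfrak{A}^*$.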













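We shall next prove that, conversely, the proper separability of $\Omega^*(\mathfrak{B}^*, m^*)$ implies that of $\Omega(\mathfrak{B}, m)$ and $\Omega'(\mathfrak{B}', m')$. Let $\mathfrak{A}^* = \{M_n^*\}$ $(M_n^* \in \mathfrak{B}^*,\ n = 1, 2, \dots)$ be a basis of $\Omega^*(\mathfrak{B}^*, m^*)$ which has the property (6.2). By the definition of the product measure space, there exist, for each $M_n^*$ and for each positive integer $p$, two sequences of measurable sets $\{M_{n,p,k}\}$ $(M_{n,p,k} \in \mathfrak{B},\ k = 1, 2, \dots)$ and $\{M_{n,p,k}'\}$ $(M_{n,p,k}' \in \mathfrak{B}',\ k = 1, 2, \dots)$ such that

(6.3) $M_n^* \subset \sum_{k=1}^{\infty} M_{n,p,k} \times M_{n,p,k}',$

(6.4) $\sum_{k=1}^{\infty} m(M_{n,p,k})\,m'(M_{n,p,k}') < m^*(M_n^*) + \dfrac{1}{p}.$

We shall show that the countable collection $\mathfrak{A} = \{M_{n,p,k}\}$ $(n, p, k = 1, 2, \dots)$ has the property (6.1) in the measure space $\Omega(\mathfrak{B}, m)$.

In order to prove this, let $M \in \mathfrak{B}$, let $\varepsilon$ be an arbitrary positive number, and let $\varepsilon' = \varepsilon \cdot m'(\Omega')$. Let us consider the set $M^* = M \times \Omega' \subset \Omega^*$. $M^*$ clearly belongs to $\mathfrak{B}^*$. Since the collection $\mathfrak{A}^* = \{M_n^*\}$ $(n = 1, 2, \dots)$ has the property (6.2), there exists a subsequence $\{M_{n_i}^*\}$ $(i = 1, 2, \dots)$ such that $M^* \subset \sum_{i=1}^{\infty} M_{n_i}^*$ and $\sum_{i=1}^{\infty} m^*(M_{n_i}^*) < m^*(M^*) + \dfrac{\varepsilon'}{2}$. By putting $n = n_i$ and taking $p_i$ larger than $\dfrac{2^{i+1}}{\varepsilon'}$ in (6.3) and (6.4), we have

$$M^* \subset \sum_{i=1}^{\infty} M_{n_i}^* \subset \sum_{i=1}^{\infty} \sum_{k=1}^{\infty} M_{n_i,p_i,k} \times M_{n_i,p_i,k}',$$

$$m^*(M^*) \le \sum_{i=1}^{\infty} \sum_{k=1}^{\infty} m(M_{n_i,p_i,k})\,m'(M_{n_i,p_i,k}') < \sum_{i=1}^{\infty} \Bigl(m^*(M_{n_i}^*) + \frac{\varepsilon'}{2^{i+1}}\Bigr) < m^*(M^*) + \varepsilon',$$

or $m^*(A^*) < \varepsilon'$ if we put $A^* = \sum_{i=1}^{\infty} \sum_{k=1}^{\infty} M_{n_i,p_i,k} \times M_{n_i,p_i,k}' - M^*$. Consequently, by Fubini's theorem, there exists a point $\omega' \in \Omega'$ such that $m(A^*(\omega')) < \varepsilon$, where $A^*(\omega')$ is the set of all points $\omega \in \Omega$ such that $\omega^* = (\omega, \omega') \in A^*$. Hence, if we denote by $\sum'$ the sum over all pairs of $i$ and $k$ for which $\omega' \in M_{n_i,p_i,k}'$, then we have $M \subset \sum' M_{n_i,p_i,k}$ and $m\bigl(\sum' M_{n_i,p_i,k}\bigr) < m(M) + \varepsilon$, which proves that the property (6.1) holds for the collection $\mathfrak{A}$. Since the same thing can be proved for the collection $\mathfrak{A}'$, the proof of Lemma 5 is completed.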


LEMMA 6. Let $\Omega^*(\mathfrak{B}^*, m^*)$ be the product measure space of two measure spaces $\Omega(\mathfrak{B}, m)$ and $\Omega'(\mathfrak{B}', m')$. If $\Omega^*(\mathfrak{B}^*, m^*)$ has a separating sequence of measurable sets, then there exist two sets of measure zero $N \subset \Omega$ and $N' \subset \Omega'$ such that each of the remaining parts $\Omega - N$ and $\Omega' - N'$ has a separating sequence of measurable sets.













Proof. Let $\{M_n^*\}$ $(M_n^* \in \mathfrak{B}^*,\ n = 1, 2, \dots)$ be the separating sequence in $\Omega^*$. For each $M_n^*$ and for each positive integer $p$, there exist two sequences of measurable sets $\{M_{n,p,k}\}$ $(M_{n,p,k} \in \mathfrak{B},\ k = 1, 2, \dots)$ and $\{M_{n,p,k}'\}$ $(M_{n,p,k}' \in \mathfrak{B}',\ k = 1, 2, \dots)$ such that

$$M_n^* \subset \sum_{k=1}^{\infty} M_{n,p,k} \times M_{n,p,k}' = M_{n,p}^*, \qquad \sum_{k=1}^{\infty} m(M_{n,p,k})\,m'(M_{n,p,k}') < m^*(M_n^*) + \frac{1}{p}.$$

Let us put $N_n^* = \prod_{p=1}^{\infty} M_{n,p}^* - M_n^*$, $n = 1, 2, \dots$. Then it is clear that $m^*(N_n^*) = 0$, $n = 1, 2, \dots$. Hence, by Fubini's theorem, there exists a point $\omega' \in \Omega'$ such that $m(N_n^*(\omega')) = 0$, $n = 1, 2, \dots$, where we denote by $N_n^*(\omega')$ the set of all points $\omega \in \Omega$ for which $\omega^* = (\omega, \omega') \in N_n^*$. We shall prove that if $N = \sum_{n=1}^{\infty} N_n^*(\omega')$, then the sets $\{M_{n,p,k} \cdot (\Omega - N)\}$ $(n, p, k = 1, 2, \dots)$ form a separating sequence in $\Omega - N$.

Indeed, if there exist two points $\omega, v \in \Omega - N$ which belong to the same members of $\{M_{n,p,k} \cdot (\Omega - N)\}$ $(n, p, k = 1, 2, \dots)$, then the two points $\omega^* = (\omega, \omega')$ and $v^* = (v, \omega')$ must also belong to the same members of $\{M_{n,p,k} \times M_{n,p,k}'\}$ $(n, p, k = 1, 2, \dots)$, and consequently to the same members of $\{M_{n,p}^*\}$ $(n, p = 1, 2, \dots)$. Since $\omega^*$ and $v^*$ do not belong to $N_n^*$ for $n = 1, 2, \dots$, the same thing must be true for the sequence $\{M_n^*\}$ $(n = 1, 2, \dots)$, which is a contradiction to the separating property of $\{M_n^*\}$ $(n = 1, 2, \dots)$. Hence $\{M_{n,p,k} \cdot (\Omega - N)\}$ $(n, p, k = 1, 2, \dots)$ is a separating sequence in $\Omega - N$. Since the same thing can be proved for the measure space $\Omega'(\mathfrak{B}', m')$, the proof of Lemma 6 is completed.

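LEMMA 7. Let $\bar\Omega(\bar{\mathfrak{B}}, \bar m)$ be a properly separable measure space which has a separating sequence of measurable sets, and let $\{T_t\}$ be a measurable flow defined on it. Then there exists an invariant null set $\bar N \subset \bar\Omega$ and a sequence of measurable sets $\{\bar M_n\}$ $(\bar M_n \in \bar{\mathfrak{B}},\ n = 1, 2, \dots)$ contained in $\bar\Omega_0 = \bar\Omega - \bar N$ such that

(6.5) $\{\bar M_n\}$ $(n = 1, 2, \dots)$ is at the same time a separating sequence and a basis for the measure space $\bar\Omega_0(\bar{\mathfrak{B}}, \bar m)$,

(6.6) if we denote by $\varphi_n(\bar\omega)$ the characteristic function of the set $\bar M_n$, then $\varphi_n(\bar\omega_t)$ is a Lebesgue measurable $t$-function for each $\bar\omega \in \bar\Omega_0$, and

$$\frac{1}{\varepsilon} \int_0^{\varepsilon} \varphi_n(\bar\omega_t)\,dt \to \varphi_n(\bar\omega)$$

as $\varepsilon \to 0$ for all $\bar\omega \in \bar\Omega_0$ (without any exception).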
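Proof. By Theorem 1, $\bar\Omega$ is divided into two invariant measurable sets $\bar\Omega_1$ and $\bar\Omega_2$ in such a way that $\{T_t\}$ is completely improper on $\bar\Omega_1$ and proper on $\bar\Omega_2$. It is clear that the measure spaces $\bar\Omega_1(\bar{\mathfrak{B}}, \bar m)$ and $\bar\Omega_2(\bar{\mathfrak{B}}, \bar m)$ are both properly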
















separable and that each of them has a separating sequence of measurable sets. Hence it is sufficient to prove this lemma only in the two special cases, namely, when $\{T_t\}$ is completely improper and when it is proper.

In the first case we know, by Theorem 2, that $\{T_t\}$ is the identity. Hence we can take as $\{\bar M_n\}$ $(n = 1, 2, \dots)$ the union of the basis and the separating sequence of measurable sets of $\bar\Omega(\bar{\mathfrak{B}}, \bar m)$.

In order to prove our lemma in the second case, we may assume, by Theorem 
4, that |7,} is a generalized flow built under a function, and since (in Theorem 
4) such a flow was obtained by putting together a countable number of flows 
built under a function, we have only to consider the case when {7',} itself is a 
flow built under a function. Hence we assume that {7';} is a flow built under a 
function, and throughout the remainder of this proof, we adopt the notation of 
Definition 6 (Q = &). Moreover, by the proof of Theorem 4, we may assume 
that there exists a positive constant c > 0 such that f(w) > c for all w €Q. 

We shall first show that there exists in $\Omega$ a null set $N \in \mathfrak{B}$ which is invariant under $S$ and also a sequence $\{M_n\}$ $(M_n \in \mathfrak{B},\ n = 1, 2, \dots)$ of measurable sets contained in $\Omega_0 = \Omega - N$ which is at the same time a separating sequence and a basis for a properly separable measure space $\Omega_0(\mathfrak{B}, m)$.

For this purpose, consider a subset $\bar\Omega_c$ of $\bar\Omega$ defined by

$$\bar\Omega_c = [\bar\omega : 0 \le G(\bar\omega) < c].$$

Then we have $\bar\Omega_c = \Omega \times [0, c)$, where $[0, c)$ is a semi-open interval $0 \le u < c$ of real numbers, and it is easy to see that the measure space $\bar\Omega_c(\bar{\mathfrak{B}}, \bar m)$ is the product measure space of $\Omega(\mathfrak{B}, m)$ and the interval $[0, c)$ taken with Lebesgue measure. Hence, by Lemmas 5 and 6, the measure space $\Omega(\mathfrak{B}, m)$ has a countable basis, and after throwing out a null set $M$, the remainder $\Omega - M$ contains a separating sequence of measurable sets. Let us put $N = \sum_{k=-\infty}^{+\infty} S^k(M)$. Then $N$ is an invariant null set, and it is clear that $\Omega_0 = \Omega - N$ has both a basis and a separating sequence of measurable sets.

Now we are ready to obtain the required invariant null set $\bar N$ and the sequence of measurable sets $\{\bar M_n\}$ $(n = 1, 2, \dots)$ with the properties (6.5) and (6.6). We first define $\bar N$ as the set of all points $\bar\omega$ of the form $\bar\omega = (\omega, u)$, $\omega \in N$, $0 \le u < f(\omega)$. $\bar N$ is clearly an invariant null set in $\bar\Omega$. Let us further define the sets $\bar M_{n,a,b}$ by

$$\bar M_{n,a,b} = (M_n \times [a, b)) \cdot (\bar\Omega - \bar N),$$

where $a$ and $b$ are two rational numbers such that $0 \le a < b < +\infty$, and $[a, b)$ is a semi-open interval $a \le u < b$. It is clear that these $\bar M_{n,a,b}$ are the required sets. We have only to rearrange them into a simple sequence.


DEFINITION 11. Let $\Omega(\mathfrak{B}, m)$ be a measure space, and let $\Omega$ be a topological space at the same time. A measurable set $M \in \mathfrak{B}$ is called regular if for any $\varepsilon > 0$ there exists an open measurable set $O \in \mathfrak{B}$ such that $M \subset O$ and $m(O) <$









$m(M) + \varepsilon$. A measure space is called regular if every measurable set $M \in \mathfrak{B}$ is regular.

DEFINITION 12. A measure space $\Omega(\mathfrak{B}, m)$ is an M-space if it satisfies the following conditions:

(6.7) $\Omega$ is a separable metric space,
(6.8) every open set is measurable and has a positive measure,
(6.9) $\Omega(\mathfrak{B}, m)$ is a regular measure space.

DEFINITION 13. A flow $\{T_t\}$ defined on a topological measure space is called continuous if $T_t(\omega)$ is a continuous function of two variables $\omega$ and $t$.

THEOREM 5. Let $\{T_t\}$ be a measurable flow defined on a measure space which is properly separable and has a separating sequence of measurable sets. Then $\{T_t\}$ is isomorphic to a continuous flow on an M-space.

Proof. We first apply Lemma 7 and find an invariant null set $N$ and a sequence of measurable sets $\{M_n\}$ $(n = 1, 2, \dots)$ with the properties (6.5) and (6.6) of the sets $\bar N$ and $\{\bar M_n\}$ $(n = 1, 2, \dots)$ of that lemma. (We change to the simpler notation because it will not be necessary here to represent our flow as a flow built under a function.)

Let us denote by $\varphi_n(\omega)$ the characteristic function of the set $M_n$. For each $\omega \in \Omega_0 = \Omega - N$, define a double sequence of functions $\{x_{n,m}^{\omega}(t)\}$ $(n, m = 1, 2, \dots)$ by

$$x_{n,m}^{\omega}(t) = m \int_0^{1/m} \varphi_n(\omega_{t+s})\,ds, \qquad -\infty < t < +\infty.$$

It is clear that all these functions are continuous and uniformly bounded: $0 \le x_{n,m}^{\omega}(t) \le 1$. Let us define the distance $d(\omega, v)$ of two points $\omega$ and $v$ of $\Omega_0$ by

(6.10) $d(\omega, v) = \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} \sum_{k=1}^{\infty} \dfrac{1}{2^{n+m+k}} \sup_{|t| \le k} |x_{n,m}^{\omega}(t) - x_{n,m}^{v}(t)|.$

We shall prove that the following conditions are satisfied:

(6.11) $\Omega_0 = \Omega - N$ is a separable metric space with $d(\omega, v)$ as a distance,
(6.12) all open sets of $\Omega_0$ are measurable,
(6.13) $m(M)$ is a regular measure on $\Omega_0$,
(6.14) $\{T_t\}$ is a continuous flow on $\Omega_0$.


In order to prove (6.11), let $\Xi$ be the space of all sequences of functions $\xi = \{x_{n,m}(t)\}$ $(n, m = 1, 2, \dots)$, where each $x_{n,m}(t)$ is a real valued continuous function defined for $-\infty < t < +\infty$ such that $0 \le x_{n,m}(t) \le 1$. As is well known, $\Xi$ is a separable metric space with respect to the distance

(6.15) $d(\xi, \eta) = \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} \sum_{k=1}^{\infty} \dfrac{1}{2^{n+m+k}} \sup_{|t| \le k} |x_{n,m}(t) - y_{n,m}(t)|,$

where $\xi = \{x_{n,m}(t)\}$, $\eta = \{y_{n,m}(t)\}$ $(n, m = 1, 2, \dots)$. If we now put$^{19}$ $\xi(\omega) = \{x_{n,m}^{\omega}(t)\}$ $(n, m = 1, 2, \dots)$, then $\omega \to \xi(\omega)$ is a mapping of $\Omega_0$ onto a subset $\Xi_0$ of $\Xi$. Since $x_{n,m}^{\omega}(0) \to \varphi_n(\omega)$ as $m \to +\infty$ (this follows from (6.6)), and since $\omega \ne v$ $(\omega, v \in \Omega_0)$ implies the existence of an $n$ for which $\varphi_n(\omega) \ne \varphi_n(v)$ (this follows from (6.5)), this mapping gives a one-to-one correspondence of $\Omega_0$ and $\Xi_0$. Since (6.10) and (6.15) give the same distance for corresponding points, and since, as a subset of a separable metric space $\Xi$, $\Xi_0$ itself is also metric and separable, the proof of (6.11) is completed.

Since (6.11) is already proved, in order to prove (6.12), it is sufficient$^{20}$ to show that every sphere in $\Omega_0$ (defined by the metric $d(\omega, v)$) is measurable, and this will be done if we can prove that for each fixed $v$, the distance $d(\omega, v)$ is a measurable function of $\omega$. By the definition (6.10) of $d(\omega, v)$, it is sufficient to show that, for each fixed $v$, the $\omega$-function

$$\psi_{n,m,k}(\omega) = \sup_{|t| \le k} |x_{n,m}^{\omega}(t) - x_{n,m}^{v}(t)|$$

is measurable. This is, however, a direct consequence of the following two facts: (1) since $x_{n,m}^{\omega}(t)$ and $x_{n,m}^{v}(t)$ are both continuous in $t$, the sup in the formula above may be replaced by the sup for all rational numbers $t$ which satisfy $|t| \le k$; (2) for every fixed $t$, $x_{n,m}^{\omega}(t)$ is a measurable function of $\omega$.

Now we proceed to the proof of (6.13). We shall prove that for any $M \in \mathfrak{B}$ there exists a sequence of open sets $\{O_k\}$ $(k = 1, 2, \dots)$ such that $M \subset O_k$ $(k = 1, 2, \dots)$ and $m(O_k - M) \to 0$. Since the measure space $\Omega_0(\mathfrak{B}, m)$ is properly separable with $\{M_n\}$ $(n = 1, 2, \dots)$ as a basis, it is sufficient to show this for each $M_n$. In order to prove that each $M_n$ is regular, we define the sets $O_{n,k}$ by

$$O_{n,k} = \sum_{m=k}^{\infty} [\omega : x_{n,m}^{\omega}(0) > \tfrac12], \qquad k = 1, 2, \dots.$$

Then it is clear that each $O_{n,k}$ is open and that we have $M_n = \prod_{k=1}^{\infty} O_{n,k}$. (This means that $M_n$ itself is a $G_\delta$-set.) The proof of (6.13) is completed.

Lastly, we have to prove (6.14). This can be proved easily by a standard method, using the fact that

$$|x_{n,m}^{\omega}(t) - x_{n,m}^{\omega}(s)| \le 2m\,|t - s|,$$

which is clear from (5.4) and the definition of $x_{n,m}^{\omega}(t)$. We omit the proof.
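
The claim in the proof of (6.13) that $M_n$ is the intersection of the open sets $O_{n,k}$ can be checked pointwise. The sketch assumes the threshold in the definition of $O_{n,k}$ is $\tfrac12$ and that the weights in (6.10) are $2^{-(n+m+k)}$; any threshold strictly between 0 and 1 and any summable positive weights would serve equally well.

```latex
% Write x_m = x^{\omega}_{n,m}(0) = m \int_0^{1/m} \varphi_n(\omega_s)\,ds.
% By (6.6), x_m \to \varphi_n(\omega) \in \{0, 1\} for every \omega \in \Omega_0.
\begin{align*}
\omega \in M_n    &\implies x_m \to 1 \implies x_m > \tfrac12 \text{ for all large } m
                   \implies \omega \in O_{n,k} \text{ for every } k, \\
\omega \notin M_n &\implies x_m \to 0 \implies x_m \le \tfrac12 \text{ for all } m \ge k_0
                   \implies \omega \notin O_{n,k_0}.
\end{align*}
% Hence M_n = \prod_k O_{n,k}. Each O_{n,k} is open because the k = 1 term of
% (6.10) gives (under the assumed weights)
\[
  \bigl| x^{\omega}_{n,m}(0) - x^{v}_{n,m}(0) \bigr| \le 2^{\,n+m+1}\, d(\omega, v),
\]
% so \omega \mapsto x^{\omega}_{n,m}(0) is continuous in the metric d.
% The sets O_{n,k} decrease with k, so m(O_{n,k} - M_n) \to 0 when m(O_{n,1}) is finite.
```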


19 The idea of mapping a flow into the space of functions of real variables in such a way 
that the given flow goes into the translation flow on function space has been exploited by 
J. L. Doob in [2], Theorem 10, p. 769. 

20 This is sufficient since in a separable metric space every open set is a sum of a countable number of spheres.













Thus we have proved (6.11), (6.12), (6.13) and (6.14). Now, let $N_0$ be the set of all points $\omega \in \Omega_0$ such that there exists a sphere of measure zero containing this point. Then $N_0$ is an open subset of $\Omega_0$, and since by the separability of $\Omega_0$, $N_0$ is expressed as a sum of a countable number of such spheres, $N_0$ itself must be of measure zero. Moreover, since the flow $\{T_t\}$ is continuous on $\Omega_0$, $N_0$ must be an invariant set. Consequently, if we consider the space $\Omega_{00} = \Omega_0 - N_0 = \Omega - N - N_0$, then $\Omega_{00}$ is an M-space and $\{T_t\}$ is a continuous flow on $\Omega_{00}$. Since the excluded set $N + N_0$ is an invariant null set, the proof of Theorem 5 is completed.


BIBLIOGRAPHY 


0. W. AMBROSE, Change of velocities in a continuous ergodic flow, Duke Mathematical Journal, vol. 8(1941), pp. 425-440.

1. W. AMBROSE, Representation of ergodic flows, Annals of Mathematics, vol. 42(1941), pp. 723-739.

2. J. L. DOOB, One-parameter families of transformations, Duke Mathematical Journal, vol. 4(1938), pp. 752-774.

3. E. HOPF, Ergodentheorie, Berlin, 1937.

4. J. VON NEUMANN, Zur Operatorenmethode in der klassischen Mechanik, Annals of Mathematics, vol. 33(1932), pp. 587-642.

5. S. SAKS, Theory of the Integral, Warsaw, 1937.

6. N. WIENER, The ergodic theorem, Duke Mathematical Journal, vol. 5(1939), pp. 1-18.


INSTITUTE FOR ADVANCED STUDY.

















THE DECOMPOSITION OF MEASURES, II 
By WARREN AMBROSE, PAUL R. HALMOS, AND SHIZUO KAKUTANI


The main purpose of this paper is to prove that a flow on a measure space may be split into ergodic parts. The first result of this type is due to von Neumann,¹ who worked with metric, complete, separable spaces. For some important applications, particularly in probability theory, it is necessary to dispense with these topological assumptions. We shall make extensive use of the terminology, notation, and results of (D)² and (S).³ Incidentally, we find it necessary to relax somewhat the strict separability conditions of (D), redefine "direct sum", and prove a generalization of the general decomposition theorem of (D). As for the decomposition of a flow, we have chosen to reduce this to the case of a single measure preserving transformation by means of Theorem 4 of (S).

A measure space $\Omega(\mathfrak{M}, m)$ is properly separable if there exists a strictly separable Borel field $\mathfrak{B} \subset \mathfrak{M}$, such that for every $M \in \mathfrak{M}$ there is a $B \in \mathfrak{B}$ with $M \subset B$ and $m(B - M) = 0$. Throughout this paper we assume that all measure spaces considered are properly separable and complete, in the sense that any subset of a measurable set of measure zero is itself measurable. $\Omega$ is said to be a direct sum of the measure spaces $Y_x(\mathfrak{Y}_x, \nu_x)$ formed with respect to the measure space $X(\mathfrak{X}, \mu)$, in symbols

$$\Omega(\mathfrak{M}, m) = \int_{X(\mathfrak{X}, \mu)} Y_x(\mathfrak{Y}_x, \nu_x)\, d\mu(x),$$

if the conditions of §3 in (D) are satisfied, with the exception that we require only that for every $M \in \mathfrak{M}$, $MY_x$ be a measurable subset of $Y_x$ for almost every $x$.


THEOREM 1. If $\Omega(\mathfrak{M}, m)$ is a measure space and $\mathfrak{A}$ a Borel field, $\mathfrak{A} \subset \mathfrak{M}$, then there exists a set $A \in \mathfrak{A}$ of measure zero such that $\Omega - A$ is a direct sum,

$$\Omega - A = \int_X Y_x\, d\mu(x),$$

in such a way that the Borel field $\mathfrak{X}$ of all measurable $x$-sets is contained in and is equivalent to the given Borel field $\mathfrak{A}$.


Proof. Let $\mathfrak{B}$ be a strictly separable Borel field related to $\mathfrak{M}$ as in the definition of proper separability, and let $\mathfrak{A}'$ be any strictly separable Borel field contained in $\mathfrak{A}$ and equivalent to it. We may take $\mathfrak{A}' \subset \mathfrak{B}$.⁴ Now for the moment we consider only the Borel fields $\mathfrak{B}$ and $\mathfrak{A}'$ in $\Omega$, and apply Theorem 1 of (D); this tells us that there exists a set $A \in \mathfrak{A}'$, and therefore $A \in \mathfrak{A}$, of measure zero such that

$$\Omega(\mathfrak{B}, m) - A = \int_{X(\mathfrak{X}', \mu)} Y_x(\mathfrak{Y}'_x, \nu_x)\, d\mu(x),$$

and such that the Borel field $\mathfrak{X}'$ of all measurable $x$-sets coincides with $\mathfrak{A}'$ in $\Omega - A$. Let $\mathfrak{X}$ and $\mathfrak{Y}_x$ be the Borel fields obtained from $\mathfrak{X}'$ and $\mathfrak{Y}'_x$ respectively, by adjoining all subsets of sets of measure zero. We shall prove that

$$\Omega(\mathfrak{M}, m) - A = \int_{X(\mathfrak{X}, \mu)} Y_x(\mathfrak{Y}_x, \nu_x)\, d\mu(x).$$

In other words we shall prove that for each $M \in \mathfrak{M}$, $MY_x \in \mathfrak{Y}_x$ for almost all $x$, and that the integral formula

$$m(M) = \int \nu_x(MY_x)\, d\mu(x) \tag{1}$$

is valid. From the definition of proper separability we know that there exist two sets $B_1$ and $B_2$ in $\mathfrak{B}$ with $B_1 \subset M \subset B_2$ and $m(B_2 - B_1) = 0$. Moreover, $B_1Y_x$ and $B_2Y_x$ are measurable (in $Y_x$) for all $x$, $\nu_x(B_1Y_x)$ and $\nu_x(B_2Y_x)$ are measurable functions of $x$, and the integral formula (1) is valid with $B_1$ or $B_2$ in place of $M$. This implies that $\nu_x(B_1Y_x) = \nu_x(B_2Y_x)$ for almost all $x$, so that for almost all $x$, $MY_x$ is measurable and

$$\nu_x(B_1Y_x) = \nu_x(MY_x) = \nu_x(B_2Y_x).$$

Since $m(B_1) = m(M) = m(B_2)$, this implies that (1) is valid for all $M \in \mathfrak{M}$.

Received June 13, 1941.

1 J. v. Neumann, Zur Operatorenmethode in der klassischen Mechanik, Annals of Mathematics, vol. 33(1932), pp. 587-642. See p. 617.

2 (D): Paul R. Halmos, The decomposition of measures, Duke Mathematical Journal, vol. 8(1941), pp. 386-392.

3 (S): W. Ambrose and S. Kakutani, Structure and continuity of measurable flows, Duke Mathematical Journal, vol. 9(1942), pp. 25-42.

We can apply Theorem 1 to prove a decomposition theorem for measure preserving transformations which differs in two ways from Theorem 2 of (D). It applies to the case where the space is properly separable rather than strictly separable, and it does not require any separability assumptions on the collection of invariant sets. The first of these differences is unimportant and merely makes the theorem a little more general, but the second is essential for any effective applications since in general the invariant sets will not form a strictly separable (nor even a properly separable) Borel field. The proof is essentially the same as the proof of Theorem 2 of (D), except that a more ingenious argument (due to von Neumann) is necessary in order to prove ergodicity on the spaces $Y_x$ which we obtain. For this purpose we shall make use of the following simple consequence of the Birkhoff ergodic theorem.⁵ A necessary and sufficient condition that a measure preserving transformation $T$ on a measure space $\Omega(\mathfrak{M}, m)$ be ergodic is that for some sequence $B_1, B_2, \cdots$ of measurable sets, which span a Borel field $\mathfrak{B}$ equivalent to $\mathfrak{M}$,

$$\frac{1}{N} \sum_{k=0}^{N-1} \varphi(B_n, T^k\omega)$$

converge almost everywhere, as $N \to \infty$, to a constant independent of $\omega$, (where $\varphi(B, \omega)$ is the characteristic function of $B$).

4 See Lemma 1, (D). To obtain $\mathfrak{A}' \subset \mathfrak{B}$ we may replace $\mathfrak{B}$ by the Borel field spanned by $\mathfrak{A}'$ and $\mathfrak{B}$.

5 E. Hopf, Ergodentheorie, Berlin, 1937. See pp. 49-54.
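The criterion can be illustrated numerically. The following is only a sketch under assumptions that are not in the paper: the space is the unit interval with Lebesgue measure, $T$ is a rotation $\omega \mapsto \omega + \text{step} \pmod 1$, and the single test set is $B = [0, 0.3)$. For an irrational step the averages come out close to $m(B) = 0.3$ from every starting point tried; for the step $1/2$, which is not ergodic, the limit depends on the starting point.

```python
import math

def time_average(indicator, step, start, n_terms):
    """Return (1/N) * sum_{k=0}^{N-1} indicator(T^k w) for T(w) = (w + step) mod 1."""
    total = 0
    for k in range(n_terms):
        total += indicator((start + k * step) % 1.0)
    return total / n_terms

def in_B(w):
    """Characteristic function of the assumed test set B = [0, 0.3)."""
    return 1 if w < 0.3 else 0

# Irrational rotation (ergodic): the averages should not depend on the start.
golden_step = (math.sqrt(5.0) - 1.0) / 2.0
irrational = [time_average(in_B, golden_step, w, 20000) for w in (0.0, 0.25, 0.7)]

# Rotation by 1/2 (not ergodic): the averages depend on the start.
rational = [time_average(in_B, 0.5, w, 20000) for w in (0.0, 0.4)]

print(irrational, rational)
```

A finite computation of this kind can only suggest the almost-everywhere statement; it is not a substitute for it.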

THEOREM 2. If $T$ is a measure preserving transformation, then there exists an invariant set $A$ of measure zero such that $\Omega - A$ is a direct sum,

$$\Omega - A = \int_X Y_x\, d\mu(x),$$

in such a way that each $Y_x$ is invariant under $T$ and $T$ is an ergodic measure preserving transformation on $Y_x$.


Proof. Let $\mathfrak{A}$ be the Borel field of invariant sets and apply Theorem 1 to $\Omega(\mathfrak{M}, m)$ and $\mathfrak{A}$. By a slight modification of the corresponding part of the proof of Theorem 2 in (D) we prove that $T$ on $Y_x$ is measure preserving for almost all $x$. We shall explicitly prove only the ergodicity.

Let $B_1, B_2, \cdots$ be a sequence of measurable sets which span a Borel field $\mathfrak{B}$ equivalent to $\mathfrak{M}$. We shall prove that for almost every fixed $x$,

$$\frac{1}{N} \sum_{k=0}^{N-1} \varphi(B_n Y_x, T^k\omega)$$

converges to a constant independent of $\omega$. Since we know, from the ergodic theorem, that

$$\frac{1}{N} \sum_{k=0}^{N-1} \varphi(B_n, T^k\omega)$$

does converge almost everywhere (with respect to the measure $m$) to a measurable function $f_n(\omega)$, this amounts to proving that $f_n(\omega)$ depends on $x$ alone. Since $f_n(\omega)$ is invariant under $T$, $f_n(T\omega) = f_n(\omega)$, we know that $f_n(\omega)$ is measurable ($\mathfrak{A}$). Since $\mathfrak{X}$ is equivalent to $\mathfrak{A}$, we may find functions $\bar{f}_n(\omega)$, which are measurable ($\mathfrak{X}$), so that $f_n(\omega) = \bar{f}_n(\omega)$ almost everywhere. Increasing the set $A$ of measure zero, described in Theorem 1, so as to include all points in every $Y_x$ on which $T$ is not measure preserving, and all points $\omega$ for which $f_n(\omega)$ is not defined, or for which $f_n(\omega) \neq \bar{f}_n(\omega)$, $n = 1, 2, \cdots$, we obtain the desired result.

The ergodic theorem can also be used to prove not only the ergodicity but also the existence of the decomposition. For every measurable set $B$ the averages

$$\frac{1}{N} \sum_{k=0}^{N-1} \varphi(B, T^k\omega)$$

converge almost everywhere to a function $\bar{\delta}(B, \omega)$ (depending, of course, on $B$), which is invariant under $T$. Moreover, for any invariant set $A$, we have

$$\int_A \bar{\delta}(B, \omega)\, dm(\omega) = \lim_{N \to \infty} \frac{1}{N} \sum_{k=0}^{N-1} \int_A \varphi(B, T^k\omega)\, dm(\omega) = m(AB).$$

(The integrability of $\bar{\delta}$ and the legitimacy of term by term integration are easily established.) A comparison with §4 of (D) shows that $\bar{\delta}(B, \omega)$ has the defining properties of the $\delta(B, \omega)$ described there. Consequently, $\bar{\delta}(B, \omega) = \delta(B, \omega)$ almost everywhere. The chief difficulty in the proof of Theorem 1 in (D) was the application of the Radon-Nikodym differentiation theorem to prove the existence of $\delta(B, \omega)$: the equality just proved suggests another approach, by invoking the ergodic theorem and defining $\delta(B, \omega)$ to be the limit of the averages discussed above. This approach leads to the same technical difficulties as before in connection with the proof that for fixed $\omega$, $\delta(B, \omega)$ is a measure on $\mathfrak{M}$, but is otherwise a little simpler than the proof using the Nikodym theorem. It has the disadvantage of applying to Theorem 2 only; the differentiation seems to be indispensable in the proof of Theorem 1.
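A toy case may make the role of the limit function concrete. The sketch below assumes a finite space $\{0, \dots, 5\}$ with the uniform measure and a permutation as $T$; none of these choices come from the paper. The ergodic parts are then the cycles of the permutation, the limit of the averages at $\omega$ is the fraction of the cycle through $\omega$ that lies in $B$, and the integral identity over an invariant set $A$ can be checked exactly.

```python
from fractions import Fraction

def cycles_of(perm):
    """Split {0, ..., n-1} into the cycles of the permutation perm (a list)."""
    seen, cycles = set(), []
    for start in range(len(perm)):
        if start in seen:
            continue
        cycle, w = [], start
        while w not in seen:
            seen.add(w)
            cycle.append(w)
            w = perm[w]
        cycles.append(cycle)
    return cycles

def limit_average(B, cycle):
    """Limit of (1/N) sum_k phi(B, T^k w) for any w in the given cycle."""
    return Fraction(sum(1 for w in cycle if w in B), len(cycle))

perm = [1, 2, 0, 4, 3, 5]            # T(w) = perm[w]; cycles {0,1,2}, {3,4}, {5}
B = {0, 3, 4}
n = len(perm)
parts = cycles_of(perm)
delta = {w: limit_average(B, c) for c in parts for w in c}

A = set(parts[0]) | set(parts[2])    # an invariant set: a union of cycles
lhs = sum(delta[w] * Fraction(1, n) for w in A)   # integral of the limit over A
rhs = Fraction(len(A & B), n)                     # m(AB)
print(lhs, rhs)
```

Exact rational arithmetic is used so that the two sides can be compared without rounding.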


THEOREM 3. If $T_t$ is a measurable flow on a measure space $\Omega(\mathfrak{M}, m)$ which contains a separating sequence of measurable sets,⁶ then there exists an invariant set $A$ of measure zero such that $\Omega - A$ is a direct sum,

$$\Omega - A = \int_X Y_x\, d\mu(x),$$

in such a way that each $Y_x$ is invariant under $T_t$ and $T_t$ is an ergodic measurable flow on $Y_x$.


Proof. By Theorems 1, 2, and 4 of (S) it is possible to split $\Omega$ into two subspaces. For one of these the flow is the identity (for which this theorem is trivial). The other can be split into a countable sum of invariant measure spaces on each of which the flow is (isomorphic to) a flow built under a function. Hence, in the remainder of this proof, we shall assume that $T_t$ is built under a function and we shall use without explanation all the notation of Definition 6 in (S).⁷ We shall prove this theorem for $T_t$ by applying Theorem 2 to the transformation $S$ on $\Omega$, but for this purpose we must know that $\Omega$ is properly separable; this follows from Lemma 4 of (S) and the proper separability of $\bar{\Omega}$.

Applying Theorem 2 to $S$ we find a set $A$ of measure zero invariant under $S$ such that

$$\Omega - A = \int_X Y_x(\mathfrak{Y}_x, \nu_x)\, d\mu(x), \tag{2}$$

and such that $Y_x$ is invariant under $S$ and $S$ is ergodic on $Y_x$. We define the function $f_x(\omega)$ on $Y_x$ by $f_x(\omega) = f(\omega)$ for $\omega \in Y_x$, and we define the space $\bar{Y}_x$ by $\bar{Y}_x = \bar{\Omega}(Y_x \times R)$, (where $R$ denotes the real line); we make $\bar{Y}_x$ into a measure space, $\bar{Y}_x = \bar{Y}_x(\bar{\mathfrak{Y}}_x, \bar{\nu}_x)$, by defining measure multiplicatively in terms of $\nu_x$ on $Y_x$ and Lebesgue measure on $R$. It follows from the definition of a direct sum that $f_x(\omega)$ is measurable ($\mathfrak{Y}_x$), except possibly for an $x$-set, $X_0$, of measure zero. We define $\bar{A}$ by

$$\bar{A} = \bar{\Omega}(A \times R) + \sum_{x \in X_0} \bar{Y}_x;$$

we shall show that

$$\bar{\Omega} - \bar{A} = \int_{X - X_0} \bar{Y}_x\, d\mu(x), \tag{3}$$

and that this decomposition has all the properties asserted in the theorem.

6 See Definition 8 in (S). This theorem could be stated more generally, without assuming the existence of a separating sequence, but the proof would then involve a consideration of improper flows, (see Definition 7 of (S)). The present case is the only one of interest.

7 We change one detail in the notation of (S) by writing $\Omega(\mathfrak{M}, m)$ and $\bar{\Omega}(\bar{\mathfrak{M}}, \bar{m})$ in place of $\Omega(\mathfrak{B}, m)$ and $\bar{\Omega}(\bar{\mathfrak{B}}, \bar{m})$.

It is clear, since $S$ leaves $A$ and each $Y_x$ invariant, that $T_t$ leaves $\bar{A}$ and each $\bar{Y}_x$ invariant; also, from the proper separability of each $Y_x(\mathfrak{Y}_x, \nu_x)$ we conclude that each $\bar{Y}_x(\bar{\mathfrak{Y}}_x, \bar{\nu}_x)$ is properly separable. We know that, on each $Y_x$, $S$ is an ergodic measure preserving transformation, and that (for $x \in X - X_0$) $f_x(\omega)$ is measurable ($\mathfrak{Y}_x$); since $T_t$, when considered only on $\bar{Y}_x$, is the flow built under $f_x(\omega)$ on the transformation $S$, it follows that $T_t$ is an ergodic measurable flow⁸ on each $\bar{Y}_x$, ($x \in X - X_0$).

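To complete the proof we need only establish the direct sum relation (3). For this purpose let $\bar{\mathfrak{M}}_0$ be the collection of all sets $\bar{M} \subset \bar{\Omega} - \bar{A}$ satisfying the following conditions:

(i) $\bar{M} \in \bar{\mathfrak{M}}$;

(ii) $\bar{M}\bar{Y}_x \in \bar{\mathfrak{Y}}_x$ for almost every $x \in X - X_0$;

(iii) $\bar{\nu}_x(\bar{M}\bar{Y}_x)$ is measurable ($\mathfrak{X}$);

(iv) $\bar{m}(\bar{M}) = \int_{X - X_0} \bar{\nu}_x(\bar{M}\bar{Y}_x)\, d\mu(x)$.

We shall show that $\bar{\mathfrak{M}}_0 = \bar{\mathfrak{M}}$.

If $\bar{M} \subset \bar{\Omega} - \bar{A}$ is of the form

$$\bar{M} = \{\bar{\omega} : f(\omega) > c\}\{\bar{\omega} : a \le u(\bar{\omega}) < b\}(M \times R), \tag{4}$$

where $M \in \mathfrak{M}$, and $0 \le a \le b \le c$, then it is trivial (from the direct sum relation (2) and the multiplicative definition of the measures $\bar{m}$ and $\bar{\nu}_x$) that $\bar{M} \in \bar{\mathfrak{M}}_0$. It is also easy to see that if $\bar{M} \subset \bar{\Omega} - \bar{A}$ is of the form

$$\bar{M} = \{\bar{\omega} : f(\omega) > c\}(M \times R), \tag{5}$$

where $M \in \mathfrak{M}$, then $\bar{M}$ is a countable sum of disjoint sets of the form (4), and hence that $\bar{M} \in \bar{\mathfrak{M}}_0$. Since every set in the field determined by sets of the form (4) or (5) is a finite sum of pairwise disjoint sets of this form, it follows that every set in this field is in $\bar{\mathfrak{M}}_0$. Since, finally, $\bar{\mathfrak{M}}_0$ is clearly a normal class, and since along with a set of measure zero $\bar{\mathfrak{M}}_0$ contains all of its subsets, it follows that $\bar{\mathfrak{M}}_0 = \bar{\mathfrak{M}}$.⁹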





THE INSTITUTE FOR ADVANCED STUDY.





8 We use here the following easily proved statement: if $T_t$ is a flow built under a function then the measurability of the function implies the measurability of the flow. It is trivial that a flow built under a function is ergodic if and only if the transformation on the base space is ergodic.

9 We use here the theorem (see S. Saks, Theory of the Integral, Warsaw, 1937, p. 83) that the normal class determined by a field coincides with the Borel field determined by it.





THE FUCHSIAN EQUATION OF SECOND ORDER WITH 
FOUR SINGULARITIES 


By A. ERDÉLYI


1. In this paper the solutions of the Fuchsian equation of second order with 
four singularities are investigated by means of series of hypergeometric functions. 

A linear differential equation of second order with four singularities which 
are “regular points” (this name is due to Thomé whereas Fuchs himself used 
the term “points of determinateness’’; both names appear to be rather inade- 
quate) can be reduced by a linear transformation of the variables to the equa- 
tion defined by the Riemannian scheme 


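$$P\left\{\begin{matrix} 0 & 1 & a & \infty & \\ 0 & 0 & 0 & \alpha & x \\ 1-\gamma & 1-\delta & 1-\varepsilon & \beta & \end{matrix}\right\}, \tag{1.1}$$

where the exponents are connected by Riemann's relation

$$\alpha + \beta - \gamma - \delta - \varepsilon + 1 = 0. \tag{1.2}$$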


Heun’s equation defined by the scheme (1.1) is of considerable theoretical 
interest, for it is the simplest equation of Fuchsian type the coefficients of which 
are not determined uniquely by the singularities and the exponents attached 
to the singularities. In fact, in Heun’s equation (3.1) there is a constant h 
which is quite arbitrary from the point of view of the scheme (1.1) and is thus 
an accessory parameter according to the terminology of F. Klein. From the 
practical point of view, Heun’s equation is of some interest, for many of the 
differential equations occurring in the applications of analysis are special or 
limiting cases of Heun’s equation. It is sufficient to recall that the hyper- 
geometric and confluent hypergeometric equations, the differential equations of 
Lamé, Mathieu, Legendre, Bessel and Weber, those of the polynomials of Jacobi, 
Tchebicheff, Laguerre and Hermite as well as that of Bateman's k-function belong to this class.


2. The simplest way of representing the fundamental branches of the func- 
tions defined by the scheme (1.1) is to try power-series. These represent the 
functions in a circle with one singularity on the boundary and another singu- 
larity at the center of the domain of convergence. An alternative plan, pro- 
posed in this paper, consists in expanding the solutions of Heun’s equation into 
certain series of hypergeometric functions. In this way, in a certain sense, 
three singularities may be taken into consideration. The series are convergent 
in a domain the boundary of which is an elliptic limaçon with two singularities as foci and a third singularity on the circumference. Though these series may offer some points of interest even in the general case (i.e., with arbitrary values of h), yet it is only in a certain important exceptional case that their usefulness is fully revealed.

Received June 20, 1941.

It is well known that Heun’s equation has a solution regular at two singulari- 
ties of the differential equation, if h has certain values satisfying one of a set of 
transcendental equations. Following the corresponding usage in the case of 
the equations of Lamé and Mathieu, it seems appropriate to use the term 
Heun function (of the first kind) for a solution which is regular at two of the four points $0, 1, a, \infty$.

Heun functions have been the subject of several investigations. Lambe and Ward derived integral equations for such functions. Actually, Lambe and Ward suppose $1 - \alpha$ or $1 - \beta$ to be a positive integer. This causes the Heun functions to be regular at three singularities (Heun polynomials). This restriction is however by no means necessary; similar theorems hold for transcendental Heun functions, with arbitrary values of $1 - \alpha$ and $1 - \beta$. Svartholm investigated the expansion of Heun functions in series of Jacobi polynomials. His series are convergent in an ellipse the foci of which are the two singularities of Heun's equation in which the Heun function is regular. One of the remaining two singularities is on the circumference of the ellipse, except when $1 - \alpha$ or $1 - \beta$ is a positive integer when we have a terminating series representing a Heun polynomial in the whole plane.

Leaving aside Heun polynomials for the moment, we see that transcendental 
Heun functions are represented by their power-series in a circle, by their Jacobi- 
series in an ellipse of convergence. The series of hypergeometric functions 
dealt with in this paper is convergent in the exceptional case of Heun functions 
in the whole plane of the complex variable x. The domain of convergence thus 
including the two singularities of the Heun function, the behaviour of this 
function at its singularities may be studied in every detail. So far as I am aware, 
this is the first step towards an explicit knowledge of the monodromic group 
of Heun’s equation. 

Another useful feature of our series shows itself when dealing with the general 
solutions of Heun’s equation in the exceptional case. Svartholm’s series do not 
appear to be suitable for representing any other solution than Heun’s function 
of first kind. The power-series and the series of hypergeometric functions, 
however, work as well as in the general case. Besides, series of hypergeometric 
functions represent the general solution outside of a certain limaçon. This
domain containing two singularities of Heun’s equation (namely those at which 
the transcendental Heun function is singular), the general solution at these 
singularities can be determined. Hence our knowledge of the monodromic 
group of Heun’s equation may be completed as far as the singularities of Heun’s 
function are concerned. Heun’s equation has two more singularities (at which 
Heun’s function is regular); our knowledge of the transformations, related to 
these singularities, of the monodromic group is less complete. 

Let us return now to Heun polynomials. These are represented by terminating series and hence the question of convergence does not arise at all in this case with functions of the first kind. Yet here too our series are of considerable advantage. While the power-series representation of the general solution is still only convergent in a circle, the representation of the general solution as given by our method consists of a finite linear combination of hypergeometric functions and is valid in the whole plane. In this case the knowledge of the monodromic group is complete.

Also there are some interesting relations connecting two different series repre- 
senting the same Heun function. 


3. The results merely outlined in the preceding section are essentially the 
generalizations of some results on Lamé functions which will be published else- 
where. In the present case too, the type of the series to be used could be in- 
ferred from the integral equations satisfied by Heun functions. For the sake 
of brevity, however, I omit this deduction and start directly assuming the form 
of the series and verifying that it satisfies Heun’s equation if the coefficients 
satisfy certain recurrence formulas. Also some other results, mentioned only 
in the preceding section, will not be proved in extenso. The content of the 
following pages in connection with my more detailed exposition of the corre- 
sponding investigations on Lamé functions will certainly enable the reader to 
clear up to his own satisfaction every detail. 

I thought it expedient to restrict myself in the main part of this paper to 
Heun’s equation. I might be allowed some remarks, however, on the general 
import of the underlying method and on its applications to more involved 
types of differential equations. 

To fix the ideas, let us take a differential equation of second order of Fuchsian type (for the general theory of such equations see, for instance, [3],¹ p. 370) with $n$ singularities. There are $n - 3$ accessory parameters in such an equation. The fundamental solutions may be expressed either as power-series (convergent in a circle) or as a series of hypergeometric functions (convergent inside an elliptic limaçon). The coefficients satisfy $(n - 1)$-term and $(2n - 5)$-term recurrence formulas respectively.

In the general case (i.e., with arbitrary values of the accessory parameters) there is only one singularity inside the domain of convergence. If the accessory parameters satisfy some of a certain set of transcendental equations, then there is at least one solution of the Fuchsian equation regular at two, three or more of the singularities. In this case in which the circle or limaçon of convergence increases correspondingly and alternatively, this regular solution (but not the general solution of the differential equation in this exceptional case) is expressible by a series of Jacobi polynomials, see [8], convergent inside a certain ellipse. The most interesting case is that in which the accessory parameters satisfy $n - 3$ independent equations so that they take one of a denumerably infinite set of characteristic values. In this case the differential equation has a solution (of first kind) with only two singularities. While power-series and Svartholm's Jacobi series represent this solution only in a finite part of the complex plane, the series introduced here are convergent in the whole plane and give full information about the monodromic group as far as the two singularities of the solution of first kind are concerned. The series into powers or those into Jacobi polynomials are convergent in the whole plane only if the solution of first kind is regular at $n - 1$ singularities of the differential equation, and this is only possible if a condition is imposed on the exponents. In the last mentioned exceptional case our series are again more satisfactory, for they represent the general solution of the differential equation as a finite linear combination of hypergeometric functions. Also they yield the full monodromic group.

1 The numbers in brackets refer to the bibliography.

The question naturally arises if it is possible to represent by series convergent in the whole plane solutions regular at $n - 3, n - 4, \cdots, 2$ singularities of a Fuchsian equation with $n$ singularities. The answer is in the affirmative. Such a representation is possible by series of certain functions satisfying a suitably chosen set of Fuchsian equations with $4, 5, \cdots, n - 1$ singularities. Though theoretically of some interest, owing to our yet imperfect knowledge of the solutions of Fuchsian equations with $4, 5, \cdots$ singularities, this procedure is not likely to be practicable. The general principle is, however, perfectly clear: representing solutions of Fuchsian equations by series of functions which satisfy Fuchsian equations of the same order with a smaller number of singularities. The usual power-series solution fits into this scheme being its lowest stage, for (cf., for instance, [10], §10.8) powers are the solutions of the Fuchsian equation with two singularities only.

It is hardly necessary to mention that the same principle applies equally to 
Fuchsian equations of any order. 

In the rest of this paper I shall restrict myself, for the sake of brevity in the formulas, to Heun's equation

$$L[y] \equiv x(x-1)(x-a)\frac{d^2y}{dx^2} + \{\gamma(x-1)(x-a) + \delta x(x-a) + \varepsilon x(x-1)\}\frac{dy}{dx} + \alpha\beta(x-h)y = 0 \tag{3.1}$$

defined by the scheme (1.1), Riemann's relation (1.2) being supposed to be satisfied.

4. It would seem plausible that some series of the sort

$$\sum c_m\, P\left\{\begin{matrix} 0 & 1 & \infty & \\ m & 0 & \alpha & x \\ \delta-\alpha-\beta-m & 1-\delta & \beta & \end{matrix}\right\} \tag{4.1}$$

should be able to render the behavior of the function defined by the scheme (1.1) in a domain including $1$ and $\infty$.








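There are six different fundamental branches of the P-function occurring in (4.1) which will be denoted by

$$\begin{aligned}
\varphi_m^1 = \varphi_m^1(\alpha, \beta; \delta; x) &= \frac{\Gamma(\alpha-\delta+m+1)\,\Gamma(\beta-\delta+m+1)}{\Gamma(\alpha+\beta-\delta+2m+1)}\, x^m F(\alpha+m, \beta+m; \alpha+\beta-\delta+2m+1; x); \\
\varphi_m^2 = \varphi_m^2(\alpha, \beta; \delta; x) &= \frac{\Gamma(\alpha+\beta-\delta+2m)}{\Gamma(\alpha+m)\,\Gamma(\beta+m)}\, x^{\delta-\alpha-\beta-m} F(\delta-\alpha-m, \delta-\beta-m; \delta-\alpha-\beta-2m+1; x); \\
\varphi_m^3 = \varphi_m^3(\alpha, \beta; \delta; x) &= \Gamma(1-\delta)\, x^m F(\alpha+m, \beta+m; \delta; 1-x); \\
\varphi_m^4 = \varphi_m^4(\alpha, \beta; \delta; x) &= \Gamma(\delta-1)\, \frac{\Gamma(\alpha-\delta+m+1)\,\Gamma(\beta-\delta+m+1)}{\Gamma(\alpha+m)\,\Gamma(\beta+m)}\, x^m (1-x)^{1-\delta} F(\alpha-\delta+m+1, \beta-\delta+m+1; 2-\delta; 1-x); \\
\varphi_m^5 = \varphi_m^5(\alpha, \beta; \delta; x) &= (-1)^m\, \Gamma(\alpha-\delta+m+1)\, \frac{\Gamma(\beta-\alpha)}{\Gamma(\beta+m)}\, (-x)^{-\alpha} F\!\left(\alpha+m, \delta-\beta-m; 1+\alpha-\beta; \frac{1}{x}\right); \\
\varphi_m^6 = \varphi_m^6(\alpha, \beta; \delta; x) &= \varphi_m^5(\beta, \alpha; \delta; x).
\end{aligned} \tag{4.2}$$

Between these branches there are the well-known relations, for instance ([10], §§14.53 and 14.51),

$$\begin{aligned}
\varphi_m^1 &= \varphi_m^3 + \varphi_m^4, \\
\varphi_m^1 &= \varphi_m^5 + \varphi_m^6.
\end{aligned} \tag{4.3}$$

Any linear combination $\varphi_m = \sum_{i=1}^{6} A_i \varphi_m^i$ with coefficients $A_i$ independent of $m$ and $x$ satisfies the relations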


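$$x(x-1)\frac{d^2\varphi_m}{dx^2} + \{(\alpha+\beta+1)x + \delta - \alpha - \beta - 1\}\frac{d\varphi_m}{dx} + \left\{\alpha\beta + \frac{m}{x}(\alpha+\beta-\delta+m)\right\}\varphi_m = 0, \tag{4.4}$$

$$(x-1)\frac{d\varphi_m}{dx} = m\left(1 - \frac{1}{x}\right)\varphi_m - \frac{(\alpha+m)(\beta+m)}{\alpha+\beta-\delta+2m+1}\,(\varphi_m - \varphi_{m+1}) \tag{4.5}$$

and

$$\begin{aligned}
\frac{\varphi_m}{x} = {} & \left\{\frac{(\alpha+m)(\alpha-\delta+m+1) + (\beta+m)(\beta-\delta+m+1)}{(\alpha+\beta-\delta+2m-1)(\alpha+\beta-\delta+2m+1)} - \frac{1}{\alpha+\beta-\delta+2m-1}\right\}\varphi_m \\
& + \frac{(\alpha+m)(\beta+m)}{(\alpha+\beta-\delta+2m)(\alpha+\beta-\delta+2m+1)}\,\varphi_{m+1} \\
& + \frac{(\alpha-\delta+m)(\beta-\delta+m)}{(\alpha+\beta-\delta+2m-1)(\alpha+\beta-\delta+2m)}\,\varphi_{m-1}.
\end{aligned} \tag{4.6}$$

The first of these formulas follows from the hypergeometric differential equation; the two others can be proved, e.g., by using the hypergeometric series and comparing coefficients of equal powers of $x$. The last relation corresponds to one of the relationes inter functiones contiguas.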

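From Watson's asymptotic representations of hypergeometric functions we have

$$\lim_{m \to \infty} \frac{\varphi_{m+1}^1}{\varphi_m^1} = \frac{1 - (1-x)^{1/2}}{1 + (1-x)^{1/2}} \quad \text{when } \Re(1-x)^{1/2} \ge 0 \text{ is taken,} \tag{4.7}$$

while

$$\lim_{m \to \infty} \frac{\varphi_{m+1}}{\varphi_m} = \frac{1 + (1-x)^{1/2}}{1 - (1-x)^{1/2}} \quad [\Re(1-x)^{1/2} > 0] \tag{4.8}$$

whenever $\varphi_m$ is not a constant multiple of $\varphi_m^1$.

Hence $\sum c_m \varphi_m^1$ and $\sum c_m \varphi_m$ are equiconvergent with the power-series

$$\sum c_m \left\{\frac{1 - (1-x)^{1/2}}{1 + (1-x)^{1/2}}\right\}^m \quad \text{and} \quad \sum c_m \left\{\frac{1 + (1-x)^{1/2}}{1 - (1-x)^{1/2}}\right\}^m$$

respectively. Let $\lim |c_{m+1}/c_m| = 1/k$ $(m \to \infty)$. Then the respective domains of convergence are

$$\left|\frac{1 - (1-x)^{1/2}}{1 + (1-x)^{1/2}}\right| < k \quad \text{and} \quad \left|\frac{1 + (1-x)^{1/2}}{1 - (1-x)^{1/2}}\right| < k \quad [\Re(1-x)^{1/2} \ge 0]. \tag{4.9}$$

If $k < 1$, then $\sum c_m \varphi_m$ is divergent, whereas $\sum c_m \varphi_m^1$ is convergent in the domain $|1 - (1-x)^{1/2}| < k\,|1 + (1-x)^{1/2}|$. This domain is bounded by the curve

$$\left|\frac{1 - (1-x)^{1/2}}{1 + (1-x)^{1/2}}\right| = k \quad (0 < k < 1), \tag{4.10}$$

which will be shown to be an elliptic limaçon.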





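Proof. From (4.10),

$$k + \frac{1}{k} = \left|\frac{1 - (1-x)^{1/2}}{1 + (1-x)^{1/2}}\right| + \left|\frac{1 + (1-x)^{1/2}}{1 - (1-x)^{1/2}}\right| = \frac{|1 - (1-x)^{1/2}|^2 + |1 + (1-x)^{1/2}|^2}{|x|} = \frac{2 + 2\,|1-x|}{|x|}.$$

Hence (4.10) may be written

$$\tfrac{1}{2}\left(k + \frac{1}{k}\right)|x| = 1 + |1-x| \quad \text{or} \quad \left|\frac{1}{x}\right| + \left|\frac{1}{x} - 1\right| = \tfrac{1}{2}\left(k + \frac{1}{k}\right).$$

This last equation shows that $1/x$ lies on an ellipse, i.e., (4.10) is the inverse on $x = 0$ of an ellipse with foci $0, 1$ and hence an elliptic limaçon with double focus $x = 0$ and single focus $x = 1$.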
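The inversion argument can be checked numerically. The sketch below samples points of the curve (4.10) for an illustrative value $k = 0.4$ (an assumption, any $0 < k < 1$ would do) and evaluates the focal-distance sum of $1/x$ with respect to the foci $0$ and $1$; the helper name `point_on_curve` is hypothetical.

```python
import cmath
import math

def point_on_curve(k, theta):
    """Return x with (1 - s)/(1 + s) = k*exp(i*theta), where s = (1 - x)**(1/2)."""
    z = k * cmath.exp(1j * theta)
    s = (1 - z) / (1 + z)      # Re(s) > 0 because |z| < 1, so s is the principal root
    return 1 - s * s

k = 0.4
target = 0.5 * (k + 1 / k)     # claimed constant focal-distance sum of 1/x
focal_sums = []
for j in range(12):
    x = point_on_curve(k, 2 * math.pi * j / 12)
    focal_sums.append(abs(1 / x) + abs(1 / x - 1))

print(target, focal_sums)
```

All twelve sums should agree with $\tfrac{1}{2}(k + 1/k)$ up to rounding, which is the ellipse property used in the proof.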

If $k > 1$, then $\sum c_m \varphi_m^1$ is convergent in the whole $x$-plane and $\sum c_m \varphi_m$ is convergent outside of the limaçon

$$\left|\frac{1 + (1-x)^{1/2}}{1 - (1-x)^{1/2}}\right| = k \quad (1 < k). \tag{4.11}$$


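5. Let us assume a solution of Heun's equation in the form

$$y = \sum_{m=0}^{\infty} c_m \varphi_m = \sum_{m=0}^{\infty} c_m(a, h; \alpha, \beta, \gamma, \delta, \varepsilon)\, \varphi_m(\alpha, \beta; \delta; x). \tag{5.1}$$

From (3.1) and (4.4) we easily obtain

$$\begin{aligned}
L\Big[\sum c_m \varphi_m\Big] = \sum c_m \Big[ & (a - x)\{(\alpha+\beta+1)x + \delta - \alpha - \beta - 1\}\frac{d\varphi_m}{dx} + (a - x)\Big\{\alpha\beta + \frac{m}{x}(\alpha+\beta-\delta+m)\Big\}\varphi_m \\
& + \{\gamma(x-1)(x-a) + \delta x(x-a) + \varepsilon x(x-1)\}\frac{d\varphi_m}{dx} + \alpha\beta(x - h)\varphi_m \Big].
\end{aligned}$$

Using (1.2), this simplifies to

$$\sum c_m \Big[ \varepsilon a (x - 1)\frac{d\varphi_m}{dx} + \Big\{\alpha\beta(a - h) + m\,\frac{a - x}{x}\,(\alpha+\beta-\delta+m)\Big\}\varphi_m \Big].$$

Employing (4.5) and (1.2) this last expression changes into

$$\sum c_m \Big[ \Big\{\alpha\beta(a - h) - m(\alpha+\beta-\delta+m) + \varepsilon a\Big(m - \frac{(\alpha+m)(\beta+m)}{\alpha+\beta-\delta+2m+1}\Big)\Big\}\varphi_m + \varepsilon a\,\frac{(\alpha+m)(\beta+m)}{\alpha+\beta-\delta+2m+1}\,\varphi_{m+1} + \frac{a}{x}\, m(\gamma+m-1)\varphi_m \Big].$$

Finally, by the aid of (4.6), we write the last expression in the form

$$L\Big[\sum c_m \varphi_m\Big] = \sum c_m \{K_{m+1}\varphi_{m+1} + L_m \varphi_m + M_{m-1}\varphi_{m-1}\}, \tag{5.2}$$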








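where

$$\begin{aligned}
K_{m+1} = {} & a\,\frac{(\alpha+m)(\beta+m)(\varepsilon+m)(\alpha+\beta-\delta+m)}{(\alpha+\beta-\delta+2m)(\alpha+\beta-\delta+2m+1)}, \\
L_m = {} & a\,m(\gamma+m-1)\left\{\frac{(\alpha+m)(\alpha-\delta+m+1) + (\beta+m)(\beta-\delta+m+1)}{(\alpha+\beta-\delta+2m-1)(\alpha+\beta-\delta+2m+1)} - \frac{1}{\alpha+\beta-\delta+2m-1}\right\} \\
& - m(\alpha+\beta-\delta+m) - \alpha\beta h + a\,\frac{\alpha\beta(\gamma+2m) - \varepsilon m(\delta-m-1)}{\alpha+\beta-\delta+2m+1}, \\
M_{m-1} = {} & a\,\frac{(\alpha-\delta+m)(\beta-\delta+m)\,m(\gamma+m-1)}{(\alpha+\beta-\delta+2m-1)(\alpha+\beta-\delta+2m)}.
\end{aligned} \tag{5.3}$$

Hence (5.1) is a formal solution of Heun's equation if $c_m(a, h; \alpha, \beta, \gamma, \delta, \varepsilon)$ is determined by the recurrence formulas

$$\begin{aligned}
& L_0 c_0 + M_0 c_1 = 0, \\
& K_m c_{m-1} + L_m c_m + M_m c_{m+1} = 0 \quad (m = 1, 2, 3, \cdots).
\end{aligned} \tag{5.4}$$

We shall always put $c_0 = 1$, and $c_m$ shall denote throughout the rest of this paper the coefficients determined by (5.4).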
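From the recurrence formulas (5.4), the transformation

$$\begin{aligned}
& \Gamma(\alpha)\Gamma(\beta)\Gamma(\alpha-\delta+m+1)\Gamma(\beta-\delta+m+1)\, c_m(a, h; \alpha, \beta, \gamma, \delta, \varepsilon) \\
& \quad = \Gamma(\alpha+m)\Gamma(\beta+m)\Gamma(\alpha-\delta+1)\Gamma(\beta-\delta+1)\, c_m(a, h_1; \alpha-\delta+1, \beta-\delta+1, \gamma, 2-\delta, \varepsilon)
\end{aligned} \tag{5.5}$$

immediately follows. $h$ and $h_1$ are connected by the relation

$$(\alpha-\delta+1)(\beta-\delta+1)h_1 = \alpha\beta h + \gamma(1-\delta)a. \tag{5.6}$$

This transformation is in a way a counterpart to Euler's transformation of the hypergeometric series which reads

$$\varphi_m^1(\alpha, \beta; \delta; x) = \frac{\Gamma(\alpha-\delta+m+1)\Gamma(\beta-\delta+m+1)}{\Gamma(\alpha+m)\Gamma(\beta+m)}\, (1-x)^{1-\delta}\, \varphi_m^1(\alpha-\delta+1, \beta-\delta+1; 2-\delta; x) \tag{5.7}$$

in the notation adopted in the preceding section.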
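After the Gamma factors and the power $x^m$ are cancelled, (5.7) reduces to Euler's transformation $F(A, B; C; x) = (1-x)^{C-A-B} F(C-A, C-B; C; x)$ with $A = \alpha+m$, $B = \beta+m$, $C = \alpha+\beta-\delta+2m+1$. The sketch below checks this for one set of illustrative parameter values (assumptions, not taken from the paper), using a plain partial sum of the Gauss series; the helper `hyp2f1` is hypothetical and adequate only well inside $|x| < 1$.

```python
def hyp2f1(a, b, c, x, terms=400):
    """Partial sum of the Gauss series F(a, b; c; x); intended for |x| well below 1."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= (a + n) * (b + n) / ((c + n) * (n + 1)) * x
    return total

alpha, beta, delta, m, x = 0.5, 1.5, 0.9, 2, 0.3
A, B = alpha + m, beta + m
C = alpha + beta - delta + 2 * m + 1

lhs = hyp2f1(A, B, C, x)
rhs = (1 - x) ** (1 - delta) * hyp2f1(C - A, C - B, C, x)   # note C - A - B = 1 - delta
print(lhs, rhs)
```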


6. As to the convergence of the series (5.1), two cases must be distinguished. Having $\lim K_m/m^2 = \lim M_m/m^2 = \tfrac{1}{4}a$ and $\lim L_m/m^2 = \tfrac{1}{2}a - 1$ $(m \to \infty)$, we infer from a well-known theorem of Poincaré on linear difference equations ([7], cf. also [5], p. 527) that $\lim c_{m+1}/c_m$ $(m \to \infty)$ exists and is equal to one of the roots of the quadratic equation $\tfrac{1}{4}a\rho^2 + (\tfrac{1}{2}a - 1)\rho + \tfrac{1}{4}a = 0$ provided that the moduli of the roots of this equation be distinct, i.e., provided that $a$ is not a real quantity $\ge 1$. Henceforward we shall assume $-\pi < \arg(1 - a) < \pi$. The two roots of our quadratic are

$$\rho_1 = \frac{1 + (1-a)^{1/2}}{1 - (1-a)^{1/2}} \quad \text{and} \quad \rho_2 = \rho_1^{-1}. \tag{6.1}$$

Defining the square root uniquely by $\arg(1-a)^{1/2} = \tfrac{1}{2}\arg(1-a)$, of the two roots $\rho_1$ has the larger modulus.

In the general case, i.e., if $h$ has arbitrary values, we have $\lim c_{m+1}/c_m = \rho_1$. In the exceptional case, i.e., if $h$ is a root of the transcendental equation ([6], §57)

$$\frac{L_0}{M_0} - \cfrac{K_1/M_1}{L_1/M_1 - \cfrac{K_2/M_2}{L_2/M_2 - \cfrac{K_3/M_3}{L_3/M_3 - \cdots}}} = 0, \tag{6.2}$$

we have $\lim c_{m+1}/c_m = \rho_2$ $(m \to \infty)$. In the present section we assume the general case and shall deal with the exceptional case later.
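The statements about the roots and about the general case can be tried out numerically. The following sketch makes several assumptions that are not in the paper: the parameter values are arbitrary illustrations, the coefficients $K_m, L_m, M_m$ are transcribed from (5.3) with $\varepsilon$ taken from (1.2), and the limit is approximated by iterating (5.4) on the ratios $r_m = c_{m+1}/c_m$ for a few thousand steps. The function names `klm` and `coefficient_ratio` are hypothetical.

```python
import cmath

def klm(m, a, h, al, be, ga, de):
    """K_m, L_m, M_m of the recurrence (5.4), written out from (5.3)."""
    ep = al + be - ga - de + 1            # Riemann's relation (1.2)
    s = al + be - de
    K = (a * (al + m - 1) * (be + m - 1) * (ep + m - 1) * (s + m - 1)
         / ((s + 2 * m - 2) * (s + 2 * m - 1)))
    L = (a * m * (ga + m - 1)
         * (((al + m) * (al - de + m + 1) + (be + m) * (be - de + m + 1))
            / ((s + 2 * m - 1) * (s + 2 * m + 1)) - 1 / (s + 2 * m - 1))
         - m * (s + m) - al * be * h
         + a * (al * be * (ga + 2 * m) - ep * m * (de - m - 1)) / (s + 2 * m + 1))
    M = (a * (al - de + m + 1) * (be - de + m + 1) * (m + 1) * (ga + m)
         / ((s + 2 * m + 1) * (s + 2 * m + 2)))
    return K, L, M

def coefficient_ratio(a, h, al, be, ga, de, steps=4000):
    """Approximate lim c_{m+1}/c_m by iterating (5.4) on r_m = c_{m+1}/c_m."""
    _, L, M = klm(0, a, h, al, be, ga, de)
    r = -L / M                            # c_1/c_0 from L_0 c_0 + M_0 c_1 = 0
    for m in range(1, steps):
        K, L, M = klm(m, a, h, al, be, ga, de)
        r = -(L + K / r) / M              # K_m c_{m-1} + L_m c_m + M_m c_{m+1} = 0
    return r

a, h = 0.5 + 1.0j, 0.3                    # illustrative values only
al, be, ga, de = 0.5, 1.5, 0.7, 0.9
root = cmath.sqrt(1 - a)                  # principal branch: arg = (1/2) arg(1 - a)
rho1 = (1 + root) / (1 - root)
rho2 = 1 / rho1

def quadratic(p):
    return 0.25 * a * p * p + (0.5 * a - 1) * p + 0.25 * a

r = coefficient_ratio(a, h, al, be, ga, de)
print(abs(quadratic(rho1)), abs(quadratic(rho2)), abs(rho1), abs(r - rho1))
```

Working with the ratios rather than with the $c_m$ themselves avoids overflow, since $|c_m|$ grows roughly like $|\rho_1|^m$ in the general case.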
In the general case $k = |\rho_1|^{-1}$ (in the notation of section 4) and hence (5.1) is divergent unless $\varphi_m$ is a multiple of $\varphi_m^1$. Hence there is only one solution of (3.1) of type (5.1), namely

$$y_1 = \sum_{m=0}^{\infty} c_m(a, h; \alpha, \beta, \gamma, \delta, \varepsilon)\, \varphi_m^1(\alpha, \beta; \delta; x), \tag{6.3}$$

convergent in the domain

$$\left|\frac{1 - (1-x)^{1/2}}{1 + (1-x)^{1/2}}\right| < \left|\frac{1 - (1-a)^{1/2}}{1 + (1-a)^{1/2}}\right| \tag{6.4}$$

bounded by an elliptic limaçon as described in section 4; this limaçon obviously passes through $x = a$. Clearly $y_1$ belongs to the exponent zero at $x = 0$, and $y_1(0) = \Gamma(\alpha-\delta+1)\Gamma(\beta-\delta+1)/\Gamma(\alpha+\beta-\delta+1)$ by (4.2), since $c_0 = 1$. Hence

$$\frac{\Gamma(\alpha+\beta-\delta+1)}{\Gamma(\alpha-\delta+1)\Gamma(\beta-\delta+1)}\, y_1 = F(a, h; \alpha, \beta, \gamma, \delta, \varepsilon; x) \tag{6.5}$$

is the fundamental solution studied by Heun of equation (3.1). Heun proved that any solution of (3.1) can be expressed in terms of his function (6.5). Hence any solution may be expressed in series of the type (5.1).

(3.1) has 192 solutions of type (6.5) ([10], Chapter 23, Example 10), 48 of which have been given by Heun. The rest may be obtained by the transformations of Heun's series we are going to develop now.

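By the transformations (5.5) and (5.7) we have

$$\begin{aligned}
\Gamma(\alpha)\Gamma(\beta)\, y_1 & = \sum \Gamma(\alpha)\Gamma(\beta)\, c_m \varphi_m^1 \\
& = (1-x)^{1-\delta}\, \Gamma(\alpha-\delta+1)\Gamma(\beta-\delta+1) \sum c_m(a, h_1; \alpha-\delta+1, \beta-\delta+1, \gamma, 2-\delta, \varepsilon)\, \varphi_m^1(\alpha-\delta+1, \beta-\delta+1; 2-\delta; x)
\end{aligned}$$

and consequently

$$F(a, h; \alpha, \beta, \gamma, \delta, \varepsilon; x) = (1-x)^{1-\delta} F(a, h_1; \alpha-\delta+1, \beta-\delta+1, \gamma, 2-\delta, \varepsilon; x), \tag{6.6}$$

where $h$ and $h_1$ are connected by (5.6).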







Also it is easily seen that 


(6.7) F(a, h; a, B, y, 6, €; x) = r(, = ; a, B, ¥, €, 5; *) 
a a a 


and applying (6.6) to the right hand side of (6.7), we obtain the third trans- 
formation 


F(a, h; a, B, y, 6, €; x) 


x ite (2 ho *) 
(6.8) = (1-2) F\ 7, cisa—et 1 B—et+1, 72-6557 


l—e 
(1 -2) Fla, h;a—e+1,8 —e+ 1, y, 6,2 —€; 2), 


where h and he are connected by the relation 
(6.9) (a — e+ 1)(B — € + 1l)hke = aBh + y(1 — ©). 


The series of hypergeometric functions corresponding to the right hand sides of
(6.7) and (6.8) are convergent in a domain of the x-plane bounded by an elliptic
limaçon with double focus 0 and single focus a and passing through x = 1.


7. For the rest of this paper let us suppose that h is a root of (6.2). Then
k = |p|’ > 1 and consequently the series (6.3) is convergent in the whole 
x-plane and represents a solution regular at x = 0 and x = a. There is a
branch-cut along the real axis extending from x = 1 to x = $+\infty$. The solutions


ee) 
(7.1) Yi = Dd em(a, h; a, B, 7’: 6, €)om(a, B; 6; x) (2 _ 2, Pali ee , 6) 
m=( 
are convergent outside of the elliptic limaçon with foci x = 0 and 1 and passing
through x = a. Hence all solutions are convergent in domains of which
x = 1 and x = $\infty$ are inner points. Thus the substitutions of the monodromic
group are known as far as they are related to these two singularities. At x = 1
and x = $\infty$ the monodromic group of (1.1) is isomorphic to the group of
( 0 | 20 
| 
P, 0 0 a x 


lb-a-6B 1-6 B | 


$y_3$ and $y_4$ are the fundamental solutions at x = 1; $y_5$ and $y_6$ are the fundamental
solutions belonging to x = $\infty$. From (4.3) we obtain


(7.2) Y= Ysty=Yyrty, 


and all the other relations from the theory of the hypergeometric function. 
In this exceptional case $y_1$ may be developed into a series of Jacobi
polynomials (Svartholm, loc. cit.)







- = x 
(7.3) "= ¥ An F(—m,y tet m= 15452) 
m=() a 
convergent in a domain of the x-plane bounded by an ellipse with foci x = 0
and x = a and passing through x = 1. No other solution of (3.1) can be
developed into a similar series in general. I may mention without proof that the
expansions (7.3) and (6.3) are connected with each other by Lambe-Ward’s 
integral equation. More precisely, the integral equation transforms (7.3) into 
(6.3), just as Whittaker’s integral equation for Lamé functions transforms the 
Fourier-Jacobi expansion of these functions into their expansion in series of 
Legendre functions. In fact, Lambe-Ward’s integral equation is equivalent 
with the co-existence and identity of (7.3) and (6.3). It seems that Lambe- 
Ward’s integral equation being restricted to the exceptional case is intimately 
connected with there being in the general case (i.e., with arbitrary values of h) 
no expansion of type (7.3). 

It is hardly necessary to mention that beside the exceptional case dealt with 
in this section, there are other exceptional cases in which Heun’s equation has a 
solution regular at two singularities other than 0 and a of this equation. All 
these cases are obtainable by linear transformations from the one dealt with 
here and so need not be considered separately. 

Also, there are certain exceptional cases in which there is a solution regular at 


three singularities. If we take $\alpha = -M$, $M = 0, 1, 2, \dots$, and h to be a root
of (6.2) (which is an algebraic equation of degree M in this case), we have
$K_{M+1} = 0$ and $c_{M+1} = 0$. Hence also $c_{M+2} = c_{M+3} = \cdots = 0$ and all series


(5.1) terminate. y; is in this case a polynomial and identical with y; (Heun 
polynomial) ; ys is Heun’s function of the second kind. The further particulars 
are so similar to the corresponding results on Lamé polynomials that it is not 
necessary to go into details. 


REFERENCES 


1. A. Erdélyi, Expansion of Lamé functions into series of Legendre functions, in press.
2. K. Heun, Zur Theorie der Riemann'schen Funktionen zweiter Ordnung mit vier
Verzweigungspunkten, Math. Annalen, vol. 33(1889), pp. 161-179.
3. E. L. Ince, Ordinary Differential Equations, London, 1927.
4. C. G. Lambe and D. R. Ward, Some differential equations and associated integral
equations, Quart. J. Math. (Oxford), vol. 5(1934), pp. 81-97.
5. L. M. Milne-Thomson, The Calculus of Finite Differences, London, 1933.
6. O. Perron, Die Lehre von den Kettenbrüchen, Leipzig, 1913.
7. H. Poincaré, Sur les équations linéaires aux différentielles ordinaires et aux différences
finies, Amer. J. Math., vol. 7(1885), pp. 203-258.
8. N. Svartholm, Die Lösung der Fuchsschen Differentialgleichung zweiter Ordnung durch
hypergeometrische Polynome, Math. Annalen, vol. 116(1939), pp. 413-421.
9. G. N. Watson, Asymptotic expansions of hypergeometric functions, Trans. Cambridge
Phil. Soc., vol. 22(1918), pp. 277-308.
10. E. T. Whittaker and G. N. Watson, A Course of Modern Analysis, Cambridge, 1927.


UNIVERSITY OF EDINBURGH. 











A GENERALIZATION OF THE EUCLIDEAN ALGORITHM TO 
SEVERAL DIMENSIONS 


By Barkley Rosser


Summary. The Euclidean algorithm is generalized to two, three, and four 
dimensions. The generalized algorithm is applied to the solution of the fol- 
lowing problems. 

Given a positive definite quadratic form, find integer values of the variables, 
not all zero, which make the value of the form a minimum. 

Given n linear forms in n variables with determinant $\Delta \ne 0$, find integer
values of the variables, not all zero, such that each linear form is $\le |\Delta|^{1/n}$ in
absolute value.

Given n real numbers, $x_1, \dots, x_n$, not all rational, find as many sets, $a, a_1,
a_2, \dots, a_n$, of integers as desired such that simultaneously

$$|ax_i - a_i| \le a^{-1/n} \qquad (i = 1, 2, \dots, n).$$

Given

$$L = \sum_{i=1}^{n} a_i x_i,$$

the $a_i$'s being coprime integers, find a general solution, in integers, of $L = k_1$,
namely

$$x_i = \sum_{j=1}^{n} b_{ij} k_j \qquad (i = 1, \dots, n)$$

(where the b's are fixed integers, $k_1$ is the same integer that occurs in $L = k_1$,
and the other k's are arbitrary integers) such that $\sum (b_{ij})^2$ shall be a minimum.

Given a hypersphere, $\sigma$, with center at the origin and radius $\ge 1$, and

$$L = \sum_{i=1}^{n} u_i x_i,$$

the u's being real numbers, find a lattice point distinct from the origin within
(or on) $\sigma$ and as close to the hyperplane L = 0 as possible.

Given two symmetric positive definite matrices, A and B, with real compo- 
nents, find whether there is a matrix P with integral components and determinant 
$\pm 1$ such that $B = P^TAP$, and if so, find all such P's.

We open the paper with some preliminary conventions regarding terminology. 

We shall use lower case italics from a to t inclusive for rational integers, and
from u to z inclusive for real numbers. We shall use upper case italics from
A to Q inclusive for square matrices with real elements, and from R to Z
inclusive for vectors with real components. $A^T$ denotes the transpose of A.
Vectors will be thought of as matrices of one column when convenient, so that
the inner product of U and V can be written as the matrix product $U^TV$. Hence
the length of U (which we will denote by L(U)) is $(U^TU)^{1/2}$, and the cosine of
the angle between U and V is

$$\frac{U^TV}{L(U)L(V)}.$$

Received August 9, 1941; presented to the American Mathematical Society, May 2, 1941.
A summary of this paper appeared under the same title in Proceedings of the National
Academy of Sciences, vol. 27(1941), pp. 309-311.


At other times it will be convenient to think of a vector as a point in n-dimen- 
sional space, namely, as the point whose coordinates are the components of the 
vector. Thus we can speak of a limit point of a set of vectors. Any finite
sum of $X_1, \dots, X_n$ with integer coefficients, say $\sum a_iX_i$, will be called an
I. L. C. (integral linear combination) of the X's.

As our first step in generalizing the Euclidean algorithm, we will generalize 
one of the problems which this algorithm solves, namely, the problem of finding 
a greatest common factor of two integers. This generalization will be much 
facilitated if we choose an appropriate definition of the G. C. F. of two integers. 
Among the many possible, we choose the following. 

f is a G. C. F. of a and b if and only if:

1. there are integers m and n such that f = ma + nb,

2. there are integers $d_1$ and $d_2$ such that $a = d_1f$ and $b = d_2f$.

As it stands, this is a definition of an integer f being a G. C. F. of two integers
a and b. If we replace a, b, and f by x, y, and z, then we have a definition of a
real number z being a G. C. F. of two real numbers x and y. However, two
real numbers, x and y, can be identified with two collinear vectors of lengths 
x and y, and conversely. Hence we consider replacing a, b, and f by X, Y, 
and Z, and then we have a definition of a vector Z being a G. C. F. of two 
vectors X and Y. By condition 2, X and Y would have to be collinear. 

It is clear that an arbitrary pair of collinear vectors need not have a G. C. F. 
and that a necessary and sufficient condition that they should is that they be 
commensurable (that is, have commensurable lengths). Here again, generalization
is facilitated by an appropriate choice of a definition of "commensurable".
In fact, there is available a definition in which the generalization is exactly as 
simple as the special case. We present the generalization. 

DEFINITION 1. A set of vectors $V_1, V_2, \dots, V_n$ is said to be commensurable
if and only if the set of I. L. C.'s of the V's has no limit point.

Here and later, n is not to be construed as having any connection with the 
dimensionality of the space in which the V’s lie. 

We now present the generalized definition of G. C. F. 

DEFINITION 2. A set of vectors $U_1, \dots, U_m$ is a G. C. F. of a set of vectors
$V_1, \dots, V_n$ if and only if:

1. each U is an I. L. C. of the V’s, 

2. each V is an I. L. C. of the U’s, 

3. the U’s are linearly independent. 
















Note that conditions 1 and 2 of Definition 2 are strict generalizations of con- 
ditions 1 and 2 of the definition in the linear case, and condition 3 corresponds 
to the fact that in the linear case a G. C. F. consists of a single vector. 

We now undertake to generalize the Euclidean algorithm to a form which 
can be used to find a G. C. F. of a set of commensurable vectors. There are 
numerous minor variations of the Euclidean algorithm, and it will be helpful 
to start with an appropriate variation. We shall choose the variation known 
as the least remainder algorithm, and for reference shall explain how it would 
be applied to the problem of finding a G. C. F. of two collinear vectors. 

Let $U_1$ and $V_1$ be two commensurable, collinear vectors, not both zero. If
$U_1 = 0$, then $V_1$ is a G. C. F. of $U_1$ and $V_1$. So assume $U_1 \ne 0$. Since $U_1$ and
$V_1$ are collinear, we can find a real x such that $V_1 = xU_1$. Choose m a nearest
integer to x (clearly m is unique unless x is halfway between two integers).
Put $U_2 = V_1 - mU_1$, $V_2 = U_1$. Then $U_2 = V_1 - xU_1 + (x - m)U_1 =
0 + (x - m)U_1$. Since m is a nearest integer to x, $|x - m| \le \tfrac12$. So $L(U_2) \le
\tfrac12 L(U_1)$. If $U_2 \ne 0$, we can repeat the process, getting $U_3 = V_2 - nU_2$, $V_3 =
U_2$, $L(U_3) \le \tfrac12 L(U_2)$. Moreover, we can continue to repeat the process as
long as $U_n \ne 0$, and we will continue to have $L(U_{n+1}) \le \tfrac12 L(U_n)$. Also each
of $U_{n+1}$ and $V_{n+1}$ will be an I. L. C. of $U_n$ and $V_n$. Note that conversely
each of $U_n$ and $V_n$ will be an I. L. C. of $U_{n+1}$ and $V_{n+1}$. From this it follows
that, for every n, each of $U_n$ and $V_n$ is an I. L. C. of $U_1$ and $V_1$, and each of
$U_1$ and $V_1$ is an I. L. C. of $U_n$ and $V_n$. Suppose there is never an n such that
$U_n = 0$. Then we would have an infinite succession of $U_n$'s, each no more
than half as long as the preceding, and each an I. L. C. of $U_1$ and $V_1$. This
would mean that the origin is a limit point of the set of I. L. C.'s of $U_1$ and $V_1$,
which would contradict our assumption that $U_1$ and $V_1$ are commensurable.
So there must be an n such that $U_n = 0$. Then $V_n$ is an I. L. C. of $U_1$ and $V_1$,
and conversely each of $U_1$ and $V_1$ is an I. L. C. of $V_n$. So $V_n$ is a G. C. F.
of $U_1$ and $V_1$.
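
The least remainder algorithm just described can be sketched in a few lines. The sketch below is only an illustration under stated assumptions: the two collinear vectors are represented by their signed lengths along the common line, the lengths are taken to be rational so that exact arithmetic is available, and the function name `least_remainder_gcf` is hypothetical. Ties in the choice of a nearest integer are broken upward.

```python
from fractions import Fraction
from math import floor


def least_remainder_gcf(u, v):
    """Length of a G. C. F. of two commensurable collinear vectors.

    u and v are the signed lengths of U_1 and V_1 along their common line
    (an assumption of this sketch), given as integers or Fractions.
    """
    u, v = Fraction(u), Fraction(v)
    while u != 0:
        # m is a nearest integer to x, where V_n = x U_n (ties go upward)
        m = floor(v / u + Fraction(1, 2))
        # U_{n+1} = V_n - m U_n,  V_{n+1} = U_n
        u, v = v - m * u, u
    # when U_n = 0, V_n is a G. C. F.; report its length
    return abs(v)
```

For example, `least_remainder_gcf(Fraction(3, 4), Fraction(5, 6))` returns `Fraction(1, 12)`, and each pass through the loop at least halves the length of the first vector, as in the argument above.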

We now generalize to two dimensions. Let $U_1, V_1, W_1$ be three commensurable,
coplanar vectors, not all zero. If two are zero, we are reduced to the
case of two commensurable collinear vectors, which we have already discussed.
So we assume that not more than one of $U_1$, $V_1$, and $W_1$ is zero. If $U_1, V_1, W_1$
are collinear, then find a G. C. F., X, of $U_1$ and $V_1$. As this is an I. L. C. of
$U_1$ and $V_1$, any I. L. C. of X and $W_1$ is an I. L. C. of $U_1$, $V_1$, and $W_1$. So
X and $W_1$ are commensurable. Also, X and $W_1$ are collinear so that we can
find a G. C. F., Y, of X and $W_1$. Y is clearly a G. C. F. of $U_1$, $V_1$, and $W_1$.
Now we turn to the case where $U_1$, $V_1$, and $W_1$ are not collinear. There is
clearly no loss of generality in assuming $L(U_1) \le L(V_1) \le L(W_1)$.

Case 1. $U_1$ and $V_1$ are collinear. Find a G. C. F., X, of $U_1$ and $V_1$. Then
X and $W_1$ are independent. Hence X and $W_1$ constitute a G. C. F. of $U_1$,
$V_1$, and $W_1$.

Case 2. $U_1$ and $V_1$ are not collinear. Then, since $U_1, V_1, W_1$ are coplanar,







$W_1$ must be a linear combination of $U_1$ and $V_1$, say $W_1 = xU_1 + yV_1$. Choose
m and n integers nearest to x and y respectively, and put $W = W_1 - mU_1 - nV_1$.
Then $W = W_1 - xU_1 - yV_1 + (x - m)U_1 + (y - n)V_1 = (x - m)U_1 +
(y - n)V_1$. Since $U_1$ and $V_1$ are not collinear,

$$\begin{aligned}
L((x - m)U_1 + (y - n)V_1) &< L((x - m)U_1) + L((y - n)V_1) \\
&\le \tfrac12 L(U_1) + \tfrac12 L(V_1) \\
&\le \tfrac12 L(V_1) + \tfrac12 L(V_1).
\end{aligned}$$

So $L(W) < L(V_1)$. Now take $U_2$ to be the shorter of $U_1$ and W (if $L(U_1) =
L(W)$, take $U_2 = U_1$), $V_2$ to be the other of $U_1$ and W, and $W_2$ to be $V_1$. Then
$L(U_2) + L(V_2) + L(W_2) < L(U_1) + L(V_1) + L(W_1)$. Also $L(U_2) \le L(V_2) \le
L(W_2)$. So if $U_2$ and $V_2$ are not collinear, we can repeat the process. In fact,
we can continue to repeat the process as long as $U_n$ and $V_n$ are not collinear.
Note that we will have $L(U_{n+1}) + L(V_{n+1}) + L(W_{n+1}) < L(U_n) + L(V_n) +
L(W_n)$ at each step. Also, each of $U_n, V_n, W_n$ is an I. L. C. of $U_1, V_1, W_1$,
and conversely. Now suppose that $U_n$ and $V_n$ are never collinear.
Then we get an infinite succession of $U_n$'s, $V_n$'s, and $W_n$'s. Among all these
there must be an infinite number of distinct points, since otherwise we could
not have $L(U_{n+1}) + L(V_{n+1}) + L(W_{n+1}) < L(U_n) + L(V_n) + L(W_n)$ for all
n. Also each $U_n$, $V_n$, or $W_n$ must have length $\le L(U_n) + L(V_n) + L(W_n) \le
L(U_1) + L(V_1) + L(W_1)$. Hence, we have an infinite number of distinct
$U_n$'s, $V_n$'s, and $W_n$'s lying within a circle of radius $L(U_1) + L(V_1) + L(W_1)$
with center at the origin. So they must have a limit point. As they are all
I. L. C.'s of $U_1$, $V_1$, and $W_1$, this contradicts our assumption that $U_1$, $V_1$,
and $W_1$ are commensurable. So for some n, $U_n$ and $V_n$ are collinear. Then
we can apply Case 1 and find a G. C. F. of $U_n, V_n, W_n$, and this will be a
G. C. F. of $U_1, V_1, W_1$.
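
A single pass of Case 2 can be written out as follows. This is a minimal sketch, assuming plane vectors with rational components so that the arithmetic is exact; the names `reduce_step`, `nearest` and `sq_len` are hypothetical, squared lengths are compared in place of lengths, and a nearest integer is chosen by rounding ties upward.

```python
from fractions import Fraction
from math import floor


def nearest(x):
    """A nearest integer to the rational x (ties go upward)."""
    return floor(x + Fraction(1, 2))


def sq_len(p):
    """Squared length of a plane vector."""
    return p[0] * p[0] + p[1] * p[1]


def reduce_step(u, v, w):
    """One Case 2 step for plane vectors with L(u) <= L(v) <= L(w).

    Returns the next triple (U_2, V_2, W_2); u and v must not be collinear.
    """
    u, v, w = (tuple(Fraction(c) for c in p) for p in (u, v, w))
    det = u[0] * v[1] - u[1] * v[0]
    if det == 0:
        raise ValueError("u and v are collinear, so Case 1 applies instead")
    # Cramer's rule for the coefficients in w = x*u + y*v
    x = (w[0] * v[1] - w[1] * v[0]) / det
    y = (u[0] * w[1] - u[1] * w[0]) / det
    m, n = nearest(x), nearest(y)
    new = (w[0] - m * u[0] - n * v[0], w[1] - m * u[1] - n * v[1])
    # U_2 is the shorter of U_1 and W (a tie keeps U_1), V_2 the other, W_2 = V_1
    if sq_len(new) < sq_len(u):
        return new, u, v
    return u, new, v
```

With `u = (2, 0)`, `v = (0, 3)`, `w = (5, 4)` the step returns the triple `(-1, 1)`, `(2, 0)`, `(0, 3)`, whose total length is smaller than that of the original triple.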

A short digression is in order here. Just as the Euclidean algorithm can be 
applied to two incommensurable real numbers, yielding the continued fraction 
algorithm for approximating an irrational, so the algorithm just explained can 
be applied to three incommensurable vectors, and yields an algorithm for the 
simultaneous approximation of two irrationals. For instance, suppose we put 


$$U_1 = (1, 0), \quad V_1 = (0, 1), \quad W_1 = (\sqrt2, \sqrt3).$$
Then $W_1 = 1.41U_1 + 1.73V_1$. So we take
$$U_2 = W_1 - U_1 - 2V_1 = (\sqrt2 - 1,\ \sqrt3 - 2), \quad V_2 = U_1, \quad W_2 = V_1.$$
Then $W_2 = -3.73U_2 + 1.55V_2$. So we take
$$U_3 = W_2 + 4U_2 - 2V_2 = (4\sqrt2 - 6,\ 4\sqrt3 - 7), \quad V_3 = U_2, \quad W_3 = V_2.$$
Then $W_3 = -2.20U_3 + .59V_3$. So we take
$$U_4 = W_3 + 2U_3 - V_3 = (7\sqrt2 - 10,\ 7\sqrt3 - 12), \quad V_4 = U_3, \quad W_4 = V_3.$$







Then $W_4 = -2.44U_4 - .49V_4$. So we take
$$U_5 = U_4, \quad V_5 = W_4 + 2U_4 = (15\sqrt2 - 21,\ 15\sqrt3 - 26), \quad W_5 = V_4.$$
Then $W_5 = -.89U_5 - 2.03V_5$. So we take
$$U_6 = W_5 + U_5 + 2V_5 = (41\sqrt2 - 58,\ 41\sqrt3 - 71), \quad V_6 = U_5, \quad W_6 = V_5.$$
And so on. Note that $L(U_2) > L(U_3) > L(U_4) > L(U_6)$. As the length of a
vector is an upper bound on its components, there is a sense in which we can
say that the vectors $U_2, U_3, U_4, U_6$ have shortening components. As the
components have the form $(a\sqrt2 - b,\ a\sqrt3 - c)$, we have here a scheme for
choosing integers a, b, and c which make $a\sqrt2 - b$ and $a\sqrt3 - c$ simultaneously
small. Dividing through by a, we see that we are simultaneously approximating
$\sqrt2$ and $\sqrt3$ by means of fractions with the same denominator. For the instance
at hand, we can say even more. $U_2, U_3, U_4, V_5$, and $U_6$ all furnish values
of a, b, and c which satisfy


$$\left|\sqrt2 - \frac{b}{a}\right| < \frac{1}{a\sqrt a}, \qquad \left|\sqrt3 - \frac{c}{a}\right| < \frac{1}{a\sqrt a}.$$
Naturally this raises a number of questions, such as the following. 

Will the algorithm continue to give this degree of approximation indefinitely? 

Will the algorithm give a similar degree of approximation for other pairs of 
irrationals? 

What sorts of irrationals, if any, will cause the algorithm to repeat indefinitely? 

We will not attempt to answer these questions in this paper, but will proceed 
with our generalization. 
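
The numerical illustration of this digression can be retraced mechanically. The following sketch is an illustration only: it assumes floating point arithmetic is accurate enough for the first few steps, and it stores each vector as an integer triple (a, b, c) standing for the vector with components a√2 − b and a√3 − c, a bookkeeping device introduced here and not taken from the paper. The function names are hypothetical.

```python
from math import sqrt

R2, R3 = sqrt(2.0), sqrt(3.0)


def vec(t):
    """The vector (a*sqrt2 - b, a*sqrt3 - c) for an integer triple t = (a, b, c)."""
    a, b, c = t
    return (a * R2 - b, a * R3 - c)


def length(t):
    x, y = vec(t)
    return sqrt(x * x + y * y)


def step(u, v, w):
    """One step: W = W_n - m U_n - n V_n with m, n nearest to the coefficients of W_n."""
    (ux, uy), (vx, vy), (wx, wy) = vec(u), vec(v), vec(w)
    det = ux * vy - uy * vx
    x = (wx * vy - wy * vx) / det
    y = (ux * wy - uy * wx) / det
    m, n = round(x), round(y)
    new = tuple(wi - m * ui - n * vi for wi, ui, vi in zip(w, u, v))
    # the shorter of U_n and W becomes U_{n+1}; V_n becomes W_{n+1}
    if length(new) < length(u):
        return new, u, v
    return u, new, v


def run(steps):
    """Triples (U, V, W) after each of the given number of steps."""
    # U_1 = (1, 0), V_1 = (0, 1), W_1 = (sqrt2, sqrt3) written as triples (a, b, c)
    u, v, w = (0, -1, 0), (0, 0, -1), (1, 0, 0)
    history = []
    for _ in range(steps):
        u, v, w = step(u, v, w)
        history.append((u, v, w))
    return history
```

Calling `run(5)` yields the triples (1, 1, 2), (4, 6, 7), (7, 10, 12) for the first vector after the first three steps, then (15, 21, 26) as the second vector after the fourth step, and (41, 58, 71) as the first vector after the fifth, in agreement with the hand computation.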

The proof that $L(W) < L(V_1)$ which we give in the two dimensional case
will obviously not generalize to any more dimensions. A possible way out of
this difficulty is suggested by the observation that the W which we have chosen
is not always the shortest possible W. For example, looking back at our
numerical illustration, $W_4 = -2.44U_4 - .49V_4$, so that according to specifications
W should be $W_4 + 2U_4$. However, both $W_4 + 2U_4 + V_4$ and $W_4 + 3U_4$
are shorter. In fact, each of these is shorter than $U_4$, whereas $W_4 + 2U_4$ is
only shorter than $V_4$. Clearly it would improve the algorithm if we take W
to be the shortest vector of the form $W_1 - mU_1 - nV_1$ at each step. Naturally
this raises the question of how to determine m and n so as to minimize
the length of $W_1 - mU_1 - nV_1$. This question will be answered by Construction 2
below, for which we now prepare the way.

Construction 1. Given the independent vectors $V_1, V_2, \dots, V_n$, the
arbitrary vector U, and the positive constant z, find all vectors, $W = U -
\sum m_iV_i$, such that $L(W) \le z$.

We shall give the construction by induction on n. That is, we shall first 
describe the construction for n = 1. Then assuming that the construction can 
be carried out for n, we shall show how to carry it out for n + 1. 

Let n = 1. The condition that $V_1$ constitutes an independent set of vectors
implies $V_1 \ne 0$, and hence $L(V_1) \ne 0$. We wish all values of m such that
$L(U - mV_1) \le z$, that is, all m's such that $(U - mV_1)^T(U - mV_1) \le z^2$ or
$U^TU - 2mU^TV_1 + m^2V_1^TV_1 \le z^2$. Plot the parabola $y = U^TU - 2xU^TV_1 +
x^2V_1^TV_1$. Then the values of x for which $y \le z^2$ form a finite closed interval
(or perhaps a single point, or perhaps the null set), and we choose for m the
integer values of x lying in that interval.

Suppose we can perform the construction for n. Let $V_1, \dots, V_n, V_{n+1}$ be
independent. Let $U_2$ and $X_2$ be the projections of U and $V_{n+1}$ on the linear
manifold of $V_1, \dots, V_n$, and put $U_1 = U - U_2$ and $X_1 = V_{n+1} - X_2$. Then
$U_1$ and $X_1$ are orthogonal to the linear manifold of $V_1, \dots, V_n$. If $W = U -
\sum m_iV_i$, put $W_1 = U_1 - m_{n+1}X_1$ and $W_2 = U_2 - m_{n+1}X_2 - m_1V_1 - \cdots -
m_nV_n$. Then $W = W_1 + W_2$, and $W_1$ and $W_2$ are orthogonal. Hence
$[L(W)]^2 = [L(W_1)]^2 + [L(W_2)]^2$. By the construction for n = 1, find all values
of $m_{n+1}$ such that $L(W_1) = L(U_1 - m_{n+1}X_1) \le z$. For each such $m_{n+1}$, use
the construction for n to find all sets of values for $m_1, \dots, m_n$ which make
$L(W_2) = L((U_2 - m_{n+1}X_2) - m_1V_1 - \cdots - m_nV_n) \le (z^2 - [L(W_1)]^2)^{1/2}$.
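
The inductive construction can be turned into a short recursive routine. The sketch below is one possible reading, under assumptions that are not in the text: the vectors have rational components, the bound is supplied as z² (the argument `zsq`) so that no square roots are needed, projections are computed by Gram-Schmidt orthogonalization, and the integers of the case n = 1 are found by stepping outward from a nearest integer to the vertex of the parabola. The names `dot`, `project` and `construction1` are hypothetical.

```python
from fractions import Fraction
from math import floor


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def project(basis, u):
    """Projection of u on the linear manifold of the independent vectors in basis."""
    ortho = []
    for v in basis:  # Gram-Schmidt, exact over the rationals
        w = list(v)
        for q in ortho:
            c = dot(v, q) / dot(q, q)
            w = [wi - c * qi for wi, qi in zip(w, q)]
        ortho.append(w)
    p = [Fraction(0)] * len(u)
    for q in ortho:
        c = dot(u, q) / dot(q, q)
        p = [pi + c * qi for pi, qi in zip(p, q)]
    return p


def construction1(vs, u, zsq):
    """All integer tuples (m_1, ..., m_n) with [L(u - sum m_i v_i)]^2 <= zsq."""
    vs = [[Fraction(x) for x in v] for v in vs]
    u = [Fraction(x) for x in u]
    zsq = Fraction(zsq)
    if not vs:
        return [()] if dot(u, u) <= zsq else []
    head, last = vs[:-1], vs[-1]
    u2, x2 = project(head, u), project(head, last)
    u1 = [p - q for p, q in zip(u, u2)]
    x1 = [p - q for p, q in zip(last, x2)]
    a, b, c = dot(x1, x1), dot(u1, x1), dot(u1, u1)

    def w1sq(m):
        # [L(W_1)]^2 = [L(U_1 - m X_1)]^2, a parabola in m
        return a * m * m - 2 * b * m + c

    m0 = floor(b / a + Fraction(1, 2))  # a nearest integer to the vertex
    candidates = []
    m = m0
    while w1sq(m) <= zsq:
        candidates.append(m)
        m += 1
    m = m0 - 1
    while w1sq(m) <= zsq:
        candidates.append(m)
        m -= 1
    out = []
    for m in sorted(candidates):
        shifted = [p - m * q for p, q in zip(u2, x2)]
        for ms in construction1(head, shifted, zsq - w1sq(m)):
            out.append(ms + (m,))
    return out
```

For instance, `construction1([(1, 0), (0, 1)], (0, 0), 1)` returns the five integer pairs of length at most 1. Because the parabola is convex, the admissible integers form a block containing a nearest integer to the vertex, which is why stepping outward from `m0` finds them all.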
THEOREM 1. If $V_1, \dots, V_n$ are independent, then there are only a finite
number of vectors, $W = U - \sum m_iV_i$, for which $L(W) \le z$.

Proof. Observe that Construction 1, which yields all such W's, yields only
a finite number.

THEOREM 2. If $V_1, \dots, V_n$ are independent, then there is a shortest non-zero
vector, $W = U - \sum m_iV_i$.

Proof. U and $U - V_1$ are not both zero. Hence there is a non-zero W.
Choose a non-zero W. By Theorem 1, there are only a finite number of shorter
W's, and so there is a shortest non-zero W.

Note that it is not claimed that there is a unique shortest non-zero W. We 
shall see later that it is quite possible to have several shortest non-zero W’s (all 
the same length, of course). 

Construction 2. Given the independent vectors $V_1, \dots, V_n$, and an
arbitrary vector U, find a shortest non-zero vector, $W = U - \sum m_iV_i$.

The construction is indicated in the proof of Theorem 2. We will, however,
make some suggestions for abridging the computations. Note that the method
consists of finding a non-zero W, and then finding all shorter W's. The shorter
our original W is, the fewer shorter W's there are, and so the less labor to find
all of them. This suggests that as shorter W's are found, their lengths be used
as new upper bounds instead of the lengths of the earlier and longer W's. Also
when a non-zero W has been found, and we start in to find all shorter W's, we
should try the shortest $W_1$ first (see the latter part of the instructions for
Construction 1). Recall that $W_1 = U_1 - m_{n+1}X_1$, and that the $(m_{n+1})$'s are
determined by the construction for n = 1, which is such that it is easy to choose
the $W_1$'s in order of increasing length.

A word of caution is desirable. One might suppose that if $U = xV_1 + yV_2$,







then the minimum length for $U - mV_1 - nV_2$ would be attained with an m
and n which are close to x and y respectively. Such is not always the case.
Let U = (2, 2), $V_1$ = (47, 7), $V_2$ = (13, 2). Then

$$U = -7\tfrac13 V_1 + 26\tfrac23 V_2.$$

Nevertheless the nearest integers $m = -7$, $n = 27$ give the vector $(-20, -3)$,
whereas $m = -11$, $n = 40$ gives $(-1, -1)$, and $m = -8$, $n = 29$ gives the still
shorter vector $(1, 0)$. For more dimensions, the situation is obviously
no better.
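
The example can be examined by direct enumeration. The sketch below is a brute-force check over a finite window of integer pairs, which is an assumption of the sketch and not a method of the paper; the function name `shortest_in_window` is hypothetical. It reports squared lengths so that only integer arithmetic is involved.

```python
def shortest_in_window(u, v1, v2, m_values, n_values):
    """Least squared length of u - m*v1 - n*v2 over the given integer values.

    Returns (squared length, m, n, vector) for the first minimizer met.
    """
    best = None
    for m in m_values:
        for n in n_values:
            w = (u[0] - m * v1[0] - n * v2[0], u[1] - m * v1[1] - n * v2[1])
            sq = w[0] ** 2 + w[1] ** 2
            if best is None or sq < best[0]:
                best = (sq, m, n, w)
    return best


U, V1, V2 = (2, 2), (47, 7), (13, 2)
rounded = shortest_in_window(U, V1, V2, [-7], [27])      # nearest integers to x and y
far_pair = shortest_in_window(U, V1, V2, [-11], [40])    # the pair m = -11, n = 40
best = shortest_in_window(U, V1, V2, range(-15, 1), range(0, 46))
```

Here `rounded[0]` is 409, `far_pair[0]` is 2, and `best` is `(1, -8, 29, (1, 0))`, so within this window the shortest vector is far from the one suggested by rounding x and y.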

We now return to the question of generalizing the algorithm which we eluci- 
dated for the case of three commensurable coplanar vectors. It was suggested 
that if we modify that algorithm by taking W to be the shortest vector of the 
form $W_1 - mU_1 - nV_1$, the resulting algorithm can be generalized to more
dimensions without further modification. This is the case, and the next two 
theorems supply the information needed to show that the generalization is 
effective for three and four dimensions. 


THEOREM 3. Let U, V, W, X be dependent vectors. Let $L(U) \le L(V) \le
L(W)$. Let U, V, W be independent. Let Y be a shortest vector of the form $X -
mU - nV - pW$. Then L(Y) < L(W).


The proof is just like the proof of the next theorem. 


THEOREM 4. Let U, V, W, X, Y be dependent vectors. Let $L(U) \le L(V) \le
L(W) \le L(X)$. Let U, V, W, X be independent. Let Z be a shortest vector of
the form $Y - mU - nV - pW - qX$. Then L(Z) < L(X) except in the special
case where U, V, W, X are mutually orthogonal and all the same length, and $Z =
\pm\tfrac12U \pm \tfrac12V \pm \tfrac12W \pm \tfrac12X$.

Note. In the exceptional case, U, V, W, and Z would constitute a G. C. F.
of U, V, W, X, Y. Hence, the exceptional case merely provides an additional
way in which the algorithm may terminate in the case of four dimensions.

Proof. Choose a coordinate system in which we have

$U = (u_1, 0, 0, 0, 0, \dots, 0)$,
$V = (v_1, v_2, 0, 0, 0, \dots, 0)$,
$W = (w_1, w_2, w_3, 0, 0, \dots, 0)$,
$X = (x_1, x_2, x_3, x_4, 0, \dots, 0)$,
$Y = (y_1, y_2, y_3, y_4, 0, \dots, 0)$.

Then $|u_1| = L(U)$, $|v_2| \le L(V)$, $|w_3| \le L(W)$, and $|x_4| \le L(X)$. Moreover,
$|v_2| = L(V)$ if and only if $v_1 = 0$, $|w_3| = L(W)$ if and only if $w_1 = w_2 = 0$,
and $|x_4| = L(X)$ if and only if $x_1 = x_2 = x_3 = 0$. Now choose q so that the
fourth component of $Y - qX$ is numerically $\le |x_4/2|$. Then choose p so that
the third component of $Y - pW - qX$ is numerically $\le |w_3/2|$. Then choose
n so that the second component of $Y - nV - pW - qX$ is numerically $\le |v_2/2|$.
Finally choose m so that the first component of $Y - mU - nV - pW - qX$






is numerically $\le |u_1/2|$. Denote $Y - mU - nV - pW - qX$ by R. Then
L(R) is the square root of the sum of the squares of the components of R, and
hence

$$L(R) \le \left(\frac{u_1^2}{4} + \frac{v_2^2}{4} + \frac{w_3^2}{4} + \frac{x_4^2}{4}\right)^{1/2}.$$

So $L(R) \le \tfrac12\left([L(U)]^2 + [L(V)]^2 + [L(W)]^2 + [L(X)]^2\right)^{1/2}$.
However, $L(U) \le L(V) \le L(W) \le L(X)$. So $L(R) \le L(X)$. Furthermore,
Z is a shortest vector of the form $Y - mU - nV - pW - qX$. So $L(Z) \le
L(R)$. So $L(Z) \le L(X)$. Now we need to consider the circumstances under
which we can have L(Z) = L(X). Clearly this can happen only when
L(R) = L(X). This latter can happen only when $L(U) = L(V) = L(W) =
L(X) = |u_1| = |v_2| = |w_3| = |x_4|$. So we must have $v_1 = w_1 = w_2 = x_1 =
x_2 = x_3 = 0$. So U, V, W, X are mutually perpendicular and all the same
length. Moreover, the components of R must each have absolute value equal
to $\tfrac12 L(U)$. Hence $R = \pm\tfrac12U \pm \tfrac12V \pm \tfrac12W \pm \tfrac12X$. Under these circumstances,
any shortest vector (and hence Z) must have the same form as R.
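
The successive choice of q, p, n, m in this proof is easy to mechanize once the vectors are given in the triangular coordinates used there. The sketch below assumes rational components and a basis already in that triangular form; the function name `successive_rounding` is hypothetical, and ties in the choice of a nearest integer are broken upward.

```python
from fractions import Fraction
from math import floor


def successive_rounding(basis, y):
    """Vector R = y - sum c_k basis[k] built as in the proof, last vector first.

    basis[k] must have zero components after index k and a non-zero
    component at index k (the triangular coordinates of the proof).
    Returns (coefficients, R).
    """
    r = [Fraction(c) for c in y]
    coeffs = [0] * len(basis)
    for k in range(len(basis) - 1, -1, -1):  # choose q, then p, then n, then m
        pivot = Fraction(basis[k][k])
        c = floor(r[k] / pivot + Fraction(1, 2))  # a nearest integer to r_k / pivot
        coeffs[k] = c
        r = [ri - c * Fraction(bi) for ri, bi in zip(r, basis[k])]
    return coeffs, r


basis = [(2, 0, 0, 0), (1, 2, 0, 0), (0, 1, 3, 0), (1, 1, 1, 3)]
coeffs, r = successive_rounding(basis, (5, 7, 8, 10))
```

For this illustrative input the coefficients are m = 1, n = 1, p = 2, q = 3 and R = (−1, 0, −1, 1), each component being numerically at most half the corresponding diagonal entry. With the four unit vectors as basis and Y = (½, ½, ½, ½), the routine returns a vector of length exactly 1, which is the exceptional case of the theorem.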

The generalization of Theorem 4 to five dimensions is false, as one can see
by putting

U = (1, 0, 0, 0, 0),
V = (0, 1, 0, 0, 0),
W = (0, 0, 1, 0, 0),
X = (0, 0, 0, 1, 0),
Y = (0, 0, 0, 0, 1),
Z = ($\tfrac12, \tfrac12, \tfrac12, \tfrac12, \tfrac12$).

Clearly Z is the shortest vector of the form $Z - mU - nV - pW -
qX - rY$. However Y is not the shortest vector of the form $Y - mU - nV -
pW - qX - rZ$. In fact, if we put $R = Y + U + V + W + X - 2Z$, then
R = 0 and so L(R) < L(Y). Moreover U, V, W, X, Z constitute a G. C. F. for
U, V, W, X, Y, Z. This suggests a further modification of the algorithm for five
or more dimensions. Instead of invariably trying to shorten the longest vector by
subtracting an I. L. C. of the other vectors, try in turn to shorten each vector in
this fashion. Whether this modification will restore the effectiveness of the
algorithm for five dimensions, I do not know. That even this modification
can fail for a large number of dimensions is shown by the following example.
For i = 1, 2, ..., 26, let $V_i$ be the vector with twenty-six components, of
which the i-th is unity and the rest are zero. Let $V_{27}$ be the vector with
twenty-six components, all equal to 2/5. Clearly, no one of the twenty-seven V's can










be shortened by subtracting an I. L. C. of the other vectors. Nevertheless,
they have a G. C. F., namely, $V_1, \dots, V_{25}$ and $\tfrac12V_{27}$.
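
The claim about the unit vectors in the twenty-six dimensional example can be checked with a few lines of exact arithmetic. The sketch below rests on an observation that is not spelled out in the text: when a multiple r of the last vector is subtracted from the i-th unit vector, the i-th component is fixed (the vector itself may not be used), while every other component can be brought to within its distance from the nearest integer. The function names are hypothetical.

```python
from fractions import Fraction


def dist_to_integer(x):
    """Distance from the rational x to a nearest integer."""
    frac = x - (x.numerator // x.denominator)  # fractional part in [0, 1)
    return min(frac, 1 - frac)


def best_sq_len_for_unit_vector(dim, entry, r_values):
    """Least squared length of V_i - r*V_last - sum_{j != i} m_j V_j.

    V_1..V_dim are unit vectors, V_last has all dim components equal to
    entry, r runs over r_values and the m_j over all integers.
    """
    best = None
    for r in r_values:
        fixed = (1 - r * entry) ** 2                       # the i-th component
        free = (dim - 1) * dist_to_integer(r * entry) ** 2  # the other components
        total = fixed + free
        if best is None or total < best:
            best = total
    return best
```

With `dim = 26` and `entry = Fraction(2, 5)` the least squared length over r from −10 to 10 is 1, so no unit vector is shortened; values of r outside 1 to 4 already make the fixed component at least 1 in absolute value, so the window loses nothing. With `dim = 24` the same computation gives 24/25, which suggests why so many dimensions are needed for the example.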

In spite of the failure of the algorithm for a large number of dimensions, we 
have the following theorem. 


THEOREM 5. Any n commensurable vectors, not all zero, have a G. C. F.

The proof is by strong induction on n. When n = 1, matters are simple.
Assume the theorem true for n or less vectors and let us have n + 1 vectors,
$V_1, \dots, V_n, V_{n+1}$. If they are independent, they themselves constitute their
G. C. F. Now let one of them, $V_{n+1}$ (say), be dependent on the rest. If
$V_1, \dots, V_n$ are dependent, they have a G. C. F., $U_1, \dots, U_s$, with s < n.
Also any I. L. C. of $V_{n+1}$ and the U's is an I. L. C. of the V's. So $V_{n+1}$ and the
U's are commensurable. As they are at most n in number, they have a G. C. F.,
which is a G. C. F. of the V's. So we have remaining the case where $V_1, \dots, V_n$
are independent, and $V_{n+1}$ is dependent on $V_1, \dots, V_n$. Then choose
coordinates in which the V's have components as indicated:


$V_1 = (v_{11}, 0, 0, 0, \dots, 0, 0, \dots, 0)$,
$V_2 = (v_{21}, v_{22}, 0, 0, \dots, 0, 0, \dots, 0)$,
$V_3 = (v_{31}, v_{32}, v_{33}, 0, \dots, 0, 0, \dots, 0)$,
. . . . . . . . . .
$V_n = (v_{n1}, v_{n2}, v_{n3}, v_{n4}, \dots, v_{nn}, 0, \dots, 0)$,
$V_{n+1} = (w_1, w_2, w_3, w_4, \dots, w_n, 0, \dots, 0)$.

Then $v_{nn}$ and $w_n$ are commensurable. For, suppose they are not. Choose sets
of integers $m_i, n_i$ such that $m_iv_{nn} + n_iw_n$ converge to zero. Consider
$X_i = m_iV_n + n_iV_{n+1} + a_{i1}V_1 + a_{i2}V_2 + \cdots + a_{i,n-1}V_{n-1}$. By properly choosing
first $a_{i,n-1}$, then $a_{i,n-2}, \dots$, and finally $a_{i1}$, we can make the j-th component
of $X_i$ less in absolute value than $v_{jj}$ for $1 \le j \le n - 1$. As the n-th
components of the X's are all different, we have an infinite number of X's in a bounded
region, and so the X's have a limit point, contradicting the commensurability
of the V's. Hence there is a positive number z and integers a, b, c, and d such
that $av_{nn} + bw_n = z$, $cz = v_{nn}$, $dz = w_n$. Then ac + bd = 1 as we can see by
substituting the expressions for $v_{nn}$ and $w_n$ into the expression for z. Now
define $Y_1 = dV_n - cV_{n+1}$, $Y_2 = aV_n + bV_{n+1}$. Then $bY_1 + cY_2 = V_n$ and
$dY_2 - aY_1 = V_{n+1}$. So any G. C. F. of $V_1, \dots, V_{n-1}, Y_1, Y_2$ is a G. C. F.
of the V's, and conversely. Now the n vectors $V_1, \dots, V_{n-1}, Y_1$ have a
G. C. F., $U_1, \dots, U_s$. Since the n-th components of $V_1, \dots, V_{n-1}, Y_1$
are all zero, the same is true of the U's. Hence $Y_2$ and the U's form an
independent set of vectors, and hence constitute a G. C. F. of the V's.

Note that if one follows the proof of Theorem 5, one can describe a construc- 
tive procedure for finding a G. C. F. of a set of commensurable vectors. How- 
ever, it is not one that can be recommended for ease of application, and a person 
would do well to try modifications of the algorithms described in this paper 







(especially the one which is still to be described) as long as they are effective, 
and save the procedure based on the proof of Theorem 5 as a last resort. 

For completeness, we digress long enough to indicate that the converse of
Theorem 5 is true. Suppose $V_1, \dots, V_n$ have a G. C. F. $U_1, \dots, U_m$, but
the V's are incommensurable. As every I. L. C. of the V's is an I. L. C. of
the U's, the U's are incommensurable. So a sequence of I. L. C.'s of the U's,
$X_1, X_2, \dots$, approaches a limit X. So if one chooses a positive $\epsilon$, there are
an infinite number of vectors $X - X_i$ with $L(X - X_i) < \epsilon$. This contradicts
Theorem 1.

We pause for orientation. In one dimension a G. C. F. of two collinear
commensurable non-zero vectors is a shortest I. L. C. of the two vectors, and
conversely. However, in more dimensions, a G. C. F. of a set of vectors has not
the least connection with a shortest I. L. C. of the vectors. (To see this, note
that if $U_1, U_2$ constitute a G. C. F. of $V_1, \dots, V_n$, then $aU_1 + bU_2$, $cU_1 + dU_2$
also constitute a G. C. F. if and only if $ad - bc = \pm 1$.) In the remainder of the
paper we shall be concerned with another generalization of the Euclidean
algorithm, which will solve the problem of finding a shortest non-zero I. L. C.
of n independent vectors if $n \le 4$. We note that if $V_1, \dots, V_n$ are
independent and we take U = 0, then Construction 2 solves the problem of finding
a shortest non-zero I. L. C. of the V's. However, Construction 2 is not a
generalization of the Euclidean algorithm. Moreover, and this is vital,
Construction 2 is much more laborious to apply than the algorithm which we will
present.

We shall have more applications for our algorithm, and will not increase the
technical difficulties in the least, if we use a generalized definition of length.
Let A be a positive definite symmetric matrix.¹ We define the length (length
relative to A) of U to be $(U^TAU)^{1/2}$. This length will be denoted by L(U) or
$L_A(U)$. When A is the unit matrix, this reduces to the customary definition of
length. Note that $L(U) \ge 0$, equality occurring if and only if U = 0.

Consider the problem of finding a shortest vector of the form $U - nV$. For
this purpose, we seek to minimize $(U - nV)^TA(U - nV)$, which equals $U^TAU -
2nU^TAV + n^2V^TAV$. By plotting the parabola $y = U^TAU - 2xU^TAV +
x^2V^TAV$, we see that we must take n a nearest integer to $(U^TAV)/(V^TAV)$.
This amounts to choosing an n which minimizes $|U^TAV - nV^TAV|$. In other
words, the n's which minimize the length of $U - nV$ are just the quotients
which one would use to get a least numerical remainder upon dividing $U^TAV$
by $V^TAV$.
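
The choice of n can be illustrated with exact arithmetic. In the sketch below the matrix A is given as a list of rows with integer or rational entries, which is an assumption of the sketch; the names `form`, `best_quotient` and `length_sq_after` are hypothetical, and a tie between two nearest integers is broken upward.

```python
from fractions import Fraction
from math import floor


def form(a, u, v):
    """The bilinear form u^T A v for a matrix a given as a list of rows."""
    return sum(u[i] * a[i][j] * v[j] for i in range(len(u)) for j in range(len(v)))


def best_quotient(a, u, v):
    """A nearest integer n to (u^T A v)/(v^T A v); v must be non-zero."""
    ratio = Fraction(form(a, u, v)) / Fraction(form(a, v, v))
    return floor(ratio + Fraction(1, 2))


def length_sq_after(a, u, v, n):
    """Squared length, relative to A, of u - n v."""
    d = tuple(ui - n * vi for ui, vi in zip(u, v))
    return form(a, d, d)


A = [[2, 1], [1, 3]]
U, V = (5, 2), (1, 0)
n = best_quotient(A, U, V)
```

Here $U^TAV = 12$ and $V^TAV = 2$, so n = 6, and the squared length of $U - 6V$ relative to A is 10, against 12 for both n = 5 and n = 7.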

The operation of replacing $U$ by a shortest vector of the form $U - nV$ is
clearly a generalization of the basic step in the one dimensional Euclidean
algorithm. It will be our fundamental step, so we give it a name in Definition 3
(below). However, it alone is not quite adequate, and the other step which
we use is given a name in Definition 4 (below).

¹ See M. Bôcher, Introduction to Higher Algebra, p. 150. By positive definite we shall
mean what Bôcher would call positive definite and non-singular. So by the corollary on
p. 153 of Bôcher, $U^T A U = 0$ if and only if $U = 0$.













GENERALIZATION OF EUCLIDEAN ALGORITHM 69 


DEFINITION 3. If some vector of the form $U - nV$ is shorter than $U$, and if
we replace $U$ by a shortest vector of the form $U - nV$, we say that we
minimize $U$ by integral use of $V$.

DEFINITION 4. If some vector of the form $U \pm V_1 \pm V_2 \pm \cdots \pm V_n$ is
shorter than $U$, and if we replace $U$ by a shortest vector of the form
$U \pm V_1 \pm V_2 \pm \cdots \pm V_n$, we say that we minimize $U$ by unit use of
$V_1, V_2, \ldots, V_n$.

We are now ready to describe the algorithm. The algorithms for two, three, 
and four dimensions consist of repeated applications of Operations II, III, and 
IV, which we define below. 

DEFINITION 5. Operation II is an operation on two vectors, and consists of
the following. First, arrange the two vectors in order of increasing length.
Let them then be $U$ and $V$, so that $L(U) \le L(V)$. Then, minimize $V$ by
integral use of $U$.

Note that one cannot perform Operation II on just any pair of vectors. 
Notice further that when Operation II can be performed on two vectors, the 
result is a new pair of vectors, one of which is just the shorter of the two original 
vectors. Note also that each new vector is an I. L. C. of the old vectors and 
that each old vector is an I. L. C. of the new vectors. 

Similar remarks will apply to Operations III and IV. 
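Repeated application of Operation II can be sketched as follows. This is a minimal illustration, assuming integer vectors, the ordinary length ($A$ the unit matrix), and two independent starting vectors; the function names are ours.

```python
def ip(u, v):
    """Ordinary inner product (A is taken to be the unit matrix)."""
    return sum(a * b for a, b in zip(u, v))

def operation_ii(pair):
    """Apply Operation II once; return the new pair, or None if impossible."""
    U, V = sorted(pair, key=lambda t: ip(t, t))      # L(U) <= L(V)
    p, q = ip(V, U), ip(U, U)
    n = (2 * p + q) // (2 * q)                       # a nearest integer to p/q
    R = tuple(v - n * u for v, u in zip(V, U))
    if ip(R, R) < ip(V, V):                          # V really gets shorter
        return [U, R]
    return None

def reduce_pair(pair):
    """Perform Operation II until it can no longer be performed."""
    pair = list(pair)
    while True:
        nxt = operation_ii(pair)
        if nxt is None:
            return sorted(pair, key=lambda t: ip(t, t))
        pair = nxt

result = reduce_pair([(5, 8), (8, 13)])
print(result)
```

The starting pair has determinant 1, so it generates all integer points of the plane, and the loop ends with two vectors of length 1. The shorter vector is carried over unchanged at every step, as remarked above.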

DEFINITION 6. Operation III is an operation on three vectors and consists
of the following. First, arrange the three vectors in order of increasing length.
Let them then be $U, V, W$, so that $L(U) \le L(V) \le L(W)$. Then perform
the first of the following three operations which is possible.

OPERATION IIIA. Minimize one of $V$ or $W$ by integral use of $U$.

OPERATION IIIB. Minimize $W$ by integral use of $V$.

OPERATION IIIC. Minimize $W$ by unit use of $U$ and $V$.
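A sketch of Operation III in Python may make the order of the three sub-operations concrete. It assumes integer vectors, the ordinary length, and three independent starting vectors. Where the definition says "one of $V$ or $W$", this sketch arbitrarily tries $V$ first. All function names are ours.

```python
from itertools import product

def ip(u, v):
    """Ordinary inner product (A is taken to be the unit matrix)."""
    return sum(a * b for a, b in zip(u, v))

def norm2(u):
    return ip(u, u)

def integral_use(target, user):
    """A shortest vector of the form target - n*user, n an integer."""
    p, q = ip(target, user), norm2(user)
    n = (2 * p + q) // (2 * q)
    return tuple(t - n * s for t, s in zip(target, user))

def unit_use(target, users):
    """A shortest vector among target and target +/- users[0] +/- users[1] ..."""
    best = target
    for signs in product((1, -1), repeat=len(users)):
        cand = tuple(t + sum(s * u[i] for s, u in zip(signs, users))
                     for i, t in enumerate(target))
        if norm2(cand) < norm2(best):
            best = cand
    return best

def operation_iii(vectors):
    """Apply Operation III once; return the new triple, or None if impossible."""
    U, V, W = sorted(vectors, key=norm2)
    candidates = [
        (1, integral_use(V, U)),     # IIIA applied to V
        (2, integral_use(W, U)),     # IIIA applied to W
        (2, integral_use(W, V)),     # IIIB
        (2, unit_use(W, [U, V])),    # IIIC
    ]
    current = [U, V, W]
    for index, new in candidates:
        if norm2(new) < norm2(current[index]):
            current[index] = new
            return current
    return None

def reduce3(vectors):
    """Perform Operation III until it can no longer be performed."""
    vectors = list(vectors)
    while True:
        nxt = operation_iii(vectors)
        if nxt is None:
            return sorted(vectors, key=norm2)
        vectors = nxt

print(reduce3([(1, 2, 3), (0, 1, 4), (0, 0, 1)]))
```

The three starting vectors have determinant 1, so they generate every integer point of space, and the process stops at three vectors of length 1.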

DEFINITION 7. Operation IV is an operation on four vectors and consists of
the following. First, arrange the vectors in order of increasing length. Let
them then be $U, V, W, X$, so that $L(U) \le L(V) \le L(W) \le L(X)$. Then,
perform the first of the following seven operations which is possible.

OPERATION IVA. Minimize one of $V$, $W$, or $X$ by integral use of $U$.

OPERATION IVB. Minimize one of $W$ or $X$ by integral use of $V$.

OPERATION IVC. Minimize $X$ by integral use of $W$.

OPERATION IVD. Minimize one of $W$ or $X$ by unit use of $U$ and $V$.

OPERATION IVE. Minimize $X$ by unit use of $U$ and $W$.

OPERATION IVF. Minimize $X$ by unit use of $V$ and $W$.

OPERATION IVG. Minimize $X$ by unit use of $U$, $V$, and $W$.
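The seven sub-operations can be sketched in Python as below. This is an illustration only, under these assumptions: integer vectors, a symmetric integer matrix $A$ defining the length, and four independent starting vectors. Where the definition allows a choice of target, the sketch tries the shorter target first. As test data it uses the matrix of Problem I, which appears later in the paper; its first row is inferred here from the symmetry of $A$ and the relation $V^T A V = 2Q$. Function names are ours.

```python
from itertools import product

def make_ip(A):
    """Inner product u^T A v for a symmetric integer matrix A."""
    n = len(A)
    return lambda u, v: sum(u[i] * A[i][j] * v[j]
                            for i in range(n) for j in range(n))

def operation_iv(vectors, ip):
    """Apply Operation IV once; return the new list, or None if impossible."""
    vs = sorted(vectors, key=lambda v: ip(v, v))          # U, V, W, X
    # IVA, IVB, IVC: integral use of one shorter vector.
    for u in range(3):
        for t in range(u + 1, 4):
            p, q = ip(vs[t], vs[u]), ip(vs[u], vs[u])
            n = (2 * p + q) // (2 * q)                    # a nearest integer to p/q
            new = tuple(a - n * b for a, b in zip(vs[t], vs[u]))
            if ip(new, new) < ip(vs[t], vs[t]):
                vs[t] = new
                return vs
    # IVD, IVE, IVF, IVG: unit use of several shorter vectors.
    for users in [(0, 1), (0, 2), (1, 2), (0, 1, 2)]:
        for t in range(max(users) + 1, 4):
            best = vs[t]
            for signs in product((1, -1), repeat=len(users)):
                new = tuple(vs[t][i] + sum(s * vs[k][i] for s, k in zip(signs, users))
                            for i in range(4))
                if ip(new, new) < ip(best, best):
                    best = new
            if best != vs[t]:
                vs[t] = best
                return vs
    return None

def reduce4(vectors, ip):
    """Perform Operation IV until it can no longer be performed."""
    vectors = list(vectors)
    while True:
        nxt = operation_iv(vectors, ip)
        if nxt is None:
            return sorted(vectors, key=lambda v: ip(v, v))
        vectors = nxt

# Matrix of Problem I (first row inferred, see the lead-in).
A = [[14, 18, 90, 45], [18, 28, 134, 66], [90, 134, 652, 324], [45, 66, 324, 162]]
ip_A = make_ip(A)
units = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]
basis = reduce4(units, ip_A)
print(basis[0], ip_A(basis[0], basis[0]))
```

Every step replaces one vector by a strictly shorter one, so with integer data the sum of the squared lengths drops by at least 1 each time and the loop must stop.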

DEFINITION 8. We shall say that $V_1, V_2, \ldots, V_n$, in the order named, is a
minimal G. C. F. of $U_1, U_2, \ldots, U_m$ if the $V$'s constitute a G. C. F. of the
$U$'s and if moreover:

1. Of all non-zero I. L. C.'s of the $U$'s, $V_1$ is a shortest.

2. Of all I. L. C.'s of the $U$'s which are independent of $V_1$, $V_2$ is a shortest.

3. Of all I. L. C.'s of the $U$'s which are independent of $V_1$ and $V_2$, $V_3$ is a
shortest.







70 BARKLEY ROSSER 


And so on up to n. 
In five or more dimensions, a set of vectors can have a G. C. F. without
having a minimal G. C. F. To see this consider the vectors

    $U = (1, 0, 0, 0, 0)$,
    $V = (0, 1, 0, 0, 0)$,
    $W = (0, 0, 1, 0, 0)$,
    $X = (0, 0, 0, 1, 0)$,
    $Y = (0, 0, 0, 0, 1)$,
    $Z = (\tfrac12, \tfrac12, \tfrac12, \tfrac12, \tfrac12)$,


under the usual definition of length. The last and any four of the first five will
constitute a G. C. F., but there is no minimal G. C. F. On the other hand, any
commensurable set of vectors in four or less dimensions will have a minimal
G. C. F. We will show that, given a G. C. F. in four or less dimensions, one
can derive a minimal G. C. F. from it by successive applications of Operations
II, III, or IV.

We first need some theorems concerning quadratic forms, so we let $f(x, y)$,
$f(x, y, z)$, $\ldots$ denote quadratic forms in two, three, $\ldots$ variables. Also, it will
be necessary to break the argument up into cases according as the various
variables are positive or negative. In order to do this expeditiously, we will
assume the variables always non-negative and multiply $x$ by $e_1$, $y$ by $e_2$, $\ldots$,
where each of $e_1, e_2, \ldots$ can be either $1$ or $-1$.


THEOREM 6. Let

(1)    $0 < f(1, 0) \le f(0, 1)$,

(2)    $f(0, 1) \le f(e_1, e_2)$.

Let $x$ and $y$ not both be zero, and let $\alpha$ be the least non-zero one of $x$ and $y$. Then

(3)    $\alpha^2 f(1, 0) \le f(e_1 x, e_2 y)$,

and if $y \ne 0$, then

(4)    $\alpha^2 f(0, 1) \le f(e_1 x, e_2 y)$.

Moreover if neither of $x$ and $y$ is zero and if $x$ and $y$ are unequal, then the "$\le$"
in (4) can be replaced by "$<$".

Proof. Clearly, if $y = 0$, (3) is true. If $y \ne 0$, then (3) follows from (4)
and (1). So we undertake to prove (4). If $x = 0$, (4) is clearly true. So let
$x \ne 0$ and $y \ne 0$.

Case 1. $x \ge y$. If $g(x, y)$ is a quadratic form, then

    $g(x, y) = g(1, 1)y^2 + g(1, 0)(x - y)^2 + g(1, 0)y(x - y) + \{g(1, 1) - g(0, 1)\}y(x - y)$.

To see this, put $g(x, y) = u_1 x^2 + u_2 xy + u_3 y^2$ and multiply out the right side of
the equation. Putting $g(x, y) = f(e_1 x, e_2 y)$, we get

    $f(e_1 x, e_2 y) = f(e_1, e_2)y^2 + f(e_1, 0)(x - y)^2 + f(e_1, 0)y(x - y) + \{f(e_1, e_2) - f(0, e_2)\}y(x - y)$.

Now by (2)

    $f(e_1, e_2)y^2 \ge f(0, 1)y^2 = f(0, 1)\alpha^2$.

Also

    $f(e_1, 0) = f(1, 0) > 0$.

Also by (2),

    $f(e_1, e_2) - f(0, e_2) = f(e_1, e_2) - f(0, 1) \ge 0$.

So

    $f(e_1 x, e_2 y) \ge f(0, 1)\alpha^2$,

and if $x \ne y$,

    $f(e_1 x, e_2 y) > f(0, 1)\alpha^2$.

Case 2. $x \le y$. In this case, put $g(y, x) = f(e_1 x, e_2 y)$. So

    $f(e_1 x, e_2 y) = f(e_1, e_2)x^2 + f(0, e_2)(y - x)^2 + f(0, e_2)x(y - x) + \{f(e_1, e_2) - f(e_1, 0)\}x(y - x)$.

However, by (2),

    $f(e_1, e_2) \ge f(0, 1)$

and by (1),

    $f(0, 1) \ge f(1, 0) = f(e_1, 0)$.

Hence

    $f(e_1, e_2) - f(e_1, 0) \ge 0$.

So

    $f(e_1 x, e_2 y) \ge f(0, 1)\alpha^2$,

and if $x \ne y$,

    $f(e_1 x, e_2 y) > f(0, 1)\alpha^2$.
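The identity used in Case 1 of the proof above can be checked mechanically. The sketch below is not part of the paper's argument; it simply tests the expansion of a binary quadratic form $g(x, y) = u_1 x^2 + u_2 xy + u_3 y^2$ on a small grid of integer coefficients and arguments. The function name is ours.

```python
from itertools import product

def identity_holds(u1, u2, u3, x, y):
    """Check g(x,y) = g(1,1)y^2 + g(1,0)(x-y)^2 + g(1,0)y(x-y)
    + {g(1,1) - g(0,1)}y(x-y) for g(s,t) = u1*s^2 + u2*s*t + u3*t^2."""
    g = lambda s, t: u1 * s * s + u2 * s * t + u3 * t * t
    rhs = (g(1, 1) * y ** 2
           + g(1, 0) * (x - y) ** 2
           + g(1, 0) * y * (x - y)
           + (g(1, 1) - g(0, 1)) * y * (x - y))
    return g(x, y) == rhs

ok = all(identity_holds(u1, u2, u3, x, y)
         for u1, u2, u3, x, y in product(range(-3, 4), repeat=5))
print(ok)
```

Since both sides are polynomials of low degree in five quantities, agreement on this grid is strong evidence, though multiplying out, as the proof suggests, is what actually establishes the identity.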

THEOREM 7. Let

(1)    $0 < f(1, 0, 0) \le f(0, 1, 0) \le f(0, 0, 1)$,

(2)    $f(0, 1, 0) \le f(e_1, e_2, 0)$,

(3)    $f(0, 0, 1) \le f(e_1, 0, e_3)$,

(4)    $f(0, 0, 1) \le f(0, e_2, e_3)$,

(5)    $f(0, 0, 1) \le f(e_1, e_2, e_3)$.

Let $x$, $y$, and $z$ not all be zero, and let $\alpha$ be the least non-zero one of $x$, $y$, and $z$. Then

(6)    $\alpha^2 f(1, 0, 0) \le f(e_1 x, e_2 y, e_3 z)$,

and if $y \ne 0$,

(7)    $\alpha^2 f(0, 1, 0) \le f(e_1 x, e_2 y, e_3 z)$,

and if $z \ne 0$,

(8)    $\alpha^2 f(0, 0, 1) \le f(e_1 x, e_2 y, e_3 z)$.

Moreover, if none of $x$, $y$, or $z$ is zero, and $x$, $y$, and $z$ are not all equal, then the
"$\le$" in (8) can be replaced by a "$<$".


Proof. If $y$ and $z$ are both zero, then (6) holds. If $y \ne 0$, then (6) follows
from (7) and (1). If $z \ne 0$, then (6) follows from (8) and (1). So we turn to
(7) and (8). If $z = 0$, we may consider $f(x, y, z)$ as a quadratic form in $x$ and $y$
only, and so (7) follows by Theorem 6. If $z \ne 0$, then (7) follows from (8)
and (1). So we have only to prove (8). If either of $x$ or $y$ is zero, we may
consider $f(x, y, z)$ as a quadratic form in the other two variables and apply
Theorem 6. So let $x \ne 0$, $y \ne 0$, $z \ne 0$.

Case 1. $x = y \ge z$. Let $g(x, z) = f(e_1 x, e_2 x, e_3 z) = f(e_1 x, e_2 y, e_3 z)$. Then

    $f(e_1 x, e_2 y, e_3 z) = f(e_1, e_2, e_3)z^2 + f(e_1, e_2, 0)(x - z)^2 + f(e_1, e_2, 0)z(x - z)$
        $+ \{f(e_1, e_2, e_3) - f(0, 0, e_3)\}z(x - z)$.

So by (5) and (2),

    $f(e_1 x, e_2 y, e_3 z) \ge f(0, 0, 1)\alpha^2$.

Case 2 and Case 3 are $x = z \ge y$ and $y = z \ge x$, and are handled similarly.

Case 4. $x > y \ge z$. If $g(x, y, z)$ is a quadratic form, then

    $g(x, y, z) = g(y, y, z) + \{g(1, 1, 0) - g(0, 1, 0)\}y(x - y)$
        $+ \{g(1, 0, 1) - g(0, 0, 1)\}z(x - y)$
        $+ g(1, 0, 0)(x - y)(y - z) + g(1, 0, 0)(x - y)^2$.

If we put $g(x, y, z) = f(e_1 x, e_2 y, e_3 z)$, we get

    $f(e_1 x, e_2 y, e_3 z) = f(e_1 y, e_2 y, e_3 z) + \{f(e_1, e_2, 0) - f(0, e_2, 0)\}y(x - y)$
        $+ \{f(e_1, 0, e_3) - f(0, 0, e_3)\}z(x - y) + f(e_1, 0, 0)(x - y)(y - z)$
        $+ f(e_1, 0, 0)(x - y)^2$.

By Case 1, $f(e_1 y, e_2 y, e_3 z) \ge f(0, 0, 1)\alpha^2$. The braces are non-negative by (2)
and (3), and the last term is positive by (1), so
$f(e_1 x, e_2 y, e_3 z) > f(0, 0, 1)\alpha^2$.

Cases 5-9, namely, $y > x \ge z$, $x > z \ge y$, $z > x \ge y$, $y > z \ge x$, and
$z > y \ge x$ are handled similarly.


THEOREM 8. Let

(1)    $0 < f(1, 0, 0, 0) \le f(0, 1, 0, 0) \le f(0, 0, 1, 0) \le f(0, 0, 0, 1)$,
(2)    $f(0, 1, 0, 0) \le f(e_1, e_2, 0, 0)$,
(3)    $f(0, 0, 1, 0) \le f(e_1, 0, e_3, 0)$,
(4)    $f(0, 0, 1, 0) \le f(0, e_2, e_3, 0)$,
(5)    $f(0, 0, 1, 0) \le f(e_1, e_2, e_3, 0)$,
(6)    $f(0, 0, 0, 1) \le f(e_1, 0, 0, e_4)$,
(7)    $f(0, 0, 0, 1) \le f(0, e_2, 0, e_4)$,
(8)    $f(0, 0, 0, 1) \le f(0, 0, e_3, e_4)$,
(9)    $f(0, 0, 0, 1) \le f(e_1, e_2, 0, e_4)$,
(10)   $f(0, 0, 0, 1) \le f(e_1, 0, e_3, e_4)$,
(11)   $f(0, 0, 0, 1) \le f(0, e_2, e_3, e_4)$,
(12)   $f(0, 0, 0, 1) \le f(e_1, e_2, e_3, e_4)$.

Let $x$, $y$, $z$, and $w$ not all be zero, and let $\alpha$ be the least non-zero one of $x$, $y$, $z$,
and $w$. If some of $x$, $y$, $z$, $w$ are not integers, put $\beta = 3\alpha^2/4$. If all of $x$, $y$, $z$, $w$
are integers, put $\beta = \max(1, 3\alpha^2/4)$. Then

(13)   $\beta f(1, 0, 0, 0) \le f(e_1 x, e_2 y, e_3 z, e_4 w)$,

and if $y \ne 0$,

(14)   $\beta f(0, 1, 0, 0) \le f(e_1 x, e_2 y, e_3 z, e_4 w)$,

and if $z \ne 0$,

(15)   $\beta f(0, 0, 1, 0) \le f(e_1 x, e_2 y, e_3 z, e_4 w)$,

and if $w \ne 0$,

(16)   $\beta f(0, 0, 0, 1) \le f(e_1 x, e_2 y, e_3 z, e_4 w)$.

Moreover, if none of $x$, $y$, $z$, or $w$ is zero, and if there are at least three different
values among the $x$, $y$, $z$, and $w$, then the "$\le$" in (16) can be replaced by a "$<$".


Proof. As in the proof of Theorem 7, we may confine our attention to the
case where $x \ne 0$, $y \ne 0$, $z \ne 0$, $w \ne 0$, and we are seeking to prove (16). We
first prove a lemma.

LEMMA. If conditions (1)-(12) are satisfied, and none of $x$, $y$, $z$, or $w$ is zero,
and if the two largest of $x$, $y$, $z$, and $w$ are equal and if $\alpha$ is the value of the smallest
of $x$, $y$, $z$, and $w$, then

    $\alpha^2 f(0, 0, 0, 1) \le f(e_1 x, e_2 y, e_3 z, e_4 w)$.

Moreover, if the $x$, $y$, $z$, and $w$ are not all equal, one can replace the "$\le$" by a "$<$".

Case 1. $x = y = z \ge w$. Put

    $g(x, w) = f(e_1 x, e_2 x, e_3 x, e_4 w) = f(e_1 x, e_2 y, e_3 z, e_4 w)$.

Then

    $f(e_1 x, e_2 y, e_3 z, e_4 w)$
        $= f(e_1, e_2, e_3, e_4)w^2 + f(e_1, e_2, e_3, 0)(x - w)^2 + f(e_1, e_2, e_3, 0)w(x - w)$
        $+ \{f(e_1, e_2, e_3, e_4) - f(0, 0, 0, e_4)\}w(x - w) \ge f(0, 0, 0, 1)\alpha^2$.

Cases 2, 3, and 4, namely, $x = y = w \ge z$, $x = z = w \ge y$, and $y = z = w \ge x$
proceed similarly.

Case 5. $x = y > z \ge w$. Put $g(x, z, w) = f(e_1 x, e_2 x, e_3 z, e_4 w) =
f(e_1 x, e_2 y, e_3 z, e_4 w)$. Then

    $f(e_1 x, e_2 y, e_3 z, e_4 w) = f(e_1 z, e_2 z, e_3 z, e_4 w)$
        $+ \{f(e_1, e_2, e_3, 0) - f(0, 0, e_3, 0)\}z(x - z)$
        $+ \{f(e_1, e_2, 0, e_4) - f(0, 0, 0, e_4)\}w(x - z)$
        $+ f(e_1, e_2, 0, 0)(x - z)(z - w) + f(e_1, e_2, 0, 0)(x - z)^2$
        $> f(0, 0, 0, 1)\alpha^2$ by Case 1.

Cases 6-16, namely, $x = z > y \ge w$, $y = z > x \ge w$, $x = y > w \ge z$,
$x = w > y \ge z$, $y = w > x \ge z$, $x = w > z \ge y$, $x = z > w \ge y$, $w = z > x \ge y$,
$y = z > w \ge x$, $y = w > z \ge x$, $z = w > y \ge x$ proceed similarly.

This completes the proof of the lemma. We return to the proof of the main
theorem, noting that the lemma disposes of all cases except where the two
largest of $x$, $y$, $z$, and $w$ are unequal.

Case 1. $x > y \ge z \ge w$. If $g(x, y, z, w)$ is a quadratic form, then

    $g(x, y, z, w) = g(y, y, z, w) + \{g(1, 1, 0, 0) - g(0, 1, 0, 0)\}y(x - y)$
        $+ \{g(1, 0, 1, 0) - g(0, 0, 1, 0)\}z(x - y)$
        $+ \{g(1, 0, 0, 1) - g(0, 0, 0, 1)\}w(x - y)$
        $+ g(1, 0, 0, 0)(x - y)(y - z) + g(1, 0, 0, 0)(x - y - w)(x - y)$.

Putting $g(x, y, z, w) = f(e_1 x, e_2 y, e_3 z, e_4 w)$, we get

    $f(e_1 x, e_2 y, e_3 z, e_4 w) = f(e_1 y, e_2 y, e_3 z, e_4 w)$
        $+ \{f(e_1, e_2, 0, 0) - f(0, e_2, 0, 0)\}y(x - y)$
        $+ \{f(e_1, 0, e_3, 0) - f(0, 0, e_3, 0)\}z(x - y)$
        $+ \{f(e_1, 0, 0, e_4) - f(0, 0, 0, e_4)\}w(x - y)$
        $+ f(e_1, 0, 0, 0)(x - y)(y - z)$
        $+ f(e_1, 0, 0, 0)(x - y - w)(x - y)$.

By the lemma this is $\ge$

    $f(0, 0, 0, 1)w^2 + f(1, 0, 0, 0)(x - y - w)(x - y)$.

Also, if there are at least three different values among $x$, $y$, $z$, and $w$, then $y$, $z$,
and $w$ are not all equal, and we may replace the "$\ge$" by a "$>$". If $x - y \ge w$,
our theorem is proved. So let $x - y < w$. Now $f(1, 0, 0, 0) \le f(0, 0, 0, 1)$.
Also $(w - u)u$ takes its maximum at $u = \tfrac12 w$, and so we get $(w - u)u \le \tfrac14 w^2$.
So

    $(w - (x - y))(x - y) \le \tfrac14 w^2$.

So

    $f(0, 0, 0, 1)w^2 + f(1, 0, 0, 0)(x - y - w)(x - y)$
        $\ge f(0, 0, 0, 1)w^2 - f(0, 0, 0, 1)\tfrac14 w^2$
        $= \tfrac34 \alpha^2 f(0, 0, 0, 1)$.

If some of $x$, $y$, $z$, $w$ are not integers, then $\beta = 3\alpha^2/4$, and we have proved

    $f(e_1 x, e_2 y, e_3 z, e_4 w) \ge \beta f(0, 0, 0, 1)$.

Suppose all of $x$, $y$, $z$, $w$ are integers. As before, we have

    $(w - (x - y))(x - y) \le \tfrac14 w^2$.

So

    $(w - (x - y))(x - y) < w^2$.

Since we are now dealing with integers,

    $(w - (x - y))(x - y) \le w^2 - 1$.

So

    $(w - (x - y))(x - y) \le \min(w^2 - 1, \tfrac14 w^2)$.

So

    $f(0, 0, 0, 1)w^2 + f(1, 0, 0, 0)(x - y - w)(x - y)$
        $\ge f(0, 0, 0, 1)w^2 - f(0, 0, 0, 1)\min(w^2 - 1, \tfrac14 w^2)$
        $= f(0, 0, 0, 1)\max(1, \tfrac34 w^2)$.

The remaining cases of the theorem proceed similarly.
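Two ingredients of the proof above lend themselves to a mechanical check: the four-variable expansion used in Case 1, and the integer bound $w^2 - (w - u)u \ge \max(1, \tfrac34 w^2)$ with $u = x - y$. The sketch below tests both on sample data. It is an illustration and not a substitute for the argument; the coefficient layout in `make_form` and the fixed random seed are our own choices.

```python
import random

def make_form(c):
    """Quadratic form in x, y, z, w built from 10 coefficients c,
    one for each product v_i * v_j with i <= j."""
    def g(x, y, z, w):
        v = (x, y, z, w)
        k, total = 0, 0
        for i in range(4):
            for j in range(i, 4):
                total += c[k] * v[i] * v[j]
                k += 1
        return total
    return g

def right_side(g, x, y, z, w):
    """Right-hand side of the expansion used in Case 1."""
    return (g(y, y, z, w)
            + (g(1, 1, 0, 0) - g(0, 1, 0, 0)) * y * (x - y)
            + (g(1, 0, 1, 0) - g(0, 0, 1, 0)) * z * (x - y)
            + (g(1, 0, 0, 1) - g(0, 0, 0, 1)) * w * (x - y)
            + g(1, 0, 0, 0) * (x - y) * (y - z)
            + g(1, 0, 0, 0) * (x - y - w) * (x - y))

rng = random.Random(0)
identity_ok = True
for _ in range(500):
    g = make_form([rng.randint(-9, 9) for _ in range(10)])
    point = [rng.randint(-9, 9) for _ in range(4)]
    identity_ok = identity_ok and g(*point) == right_side(g, *point)

# Integer bound with 0 < u < w, written without fractions:
# 4*(w^2 - (w - u)*u) >= max(4, 3*w^2).
bound_ok = all(4 * (w * w - (w - u) * u) >= max(4, 3 * w * w)
               for w in range(2, 50) for u in range(1, w))
print(identity_ok, bound_ok)
```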
Note one point about the preceding three theorems. As they are stated (and
proved), $e_1, e_2, \ldots$ could denote a particular choice of signs for $x, y, \ldots$. If
so, then the theorem says that if the hypothesis is true for a particular choice
of signs, then the conclusion is true for the same choice of signs. From this it
clearly follows that if the hypothesis is true for all possible choices (as will be
the case in the application), then the conclusion is likewise true for all choices.

Now suppose $U_1, V_1$ is a G. C. F. of $R_1, \ldots, R_m$, and suppose one can
apply Operation II to $U_1, V_1$, getting $U_2, V_2$. Then $U_2, V_2$ is also a G. C. F.
of $R_1, \ldots, R_m$. Also it is clear from the nature of Operation II, that
$L(U_2) + L(V_2) < L(U_1) + L(V_1)$. Suppose now that Operation II can be
performed on $U_2, V_2$ also, giving $U_3, V_3$. Then $U_3, V_3$ is a G. C. F. of
$R_1, \ldots, R_m$ and $L(U_3) + L(V_3) < L(U_2) + L(V_2)$. Can we continue to
perform Operation II indefinitely? No, for the following reasons. Since
$R_1, \ldots, R_m$ have a G. C. F., they are commensurable (this can easily be proved
for the generalized length by noting that, since $A = P^T P$, then
$[L_A(U)]^2 = U^T A U = U^T P^T P U = (PU)^T (PU)$ = the square of the ordinary length
of $PU$. In other words, the usual definition of length is restored by
transforming by use of $P$). If there were an infinite succession of G. C. F.'s of
the $R$'s, namely $U_n, V_n$ for $n = 1, 2, \ldots$, having the property that
$L(U_{n+1}) + L(V_{n+1}) < L(U_n) + L(V_n)$, then there would have to be an infinite
number of different I. L. C.'s of the $R$'s in a bounded region. These would have
to have a limit point, contrary to the fact that the $R$'s are commensurable.
Hence if we start with a G. C. F. of the $R$'s and keep performing Operation II
successively, we will come at last to a G. C. F. of the $R$'s on which Operation II
cannot be performed. When the vectors of this are arranged in order of
increasing length, they will constitute a minimal G. C. F. of the $R$'s, as we prove
in the theorem below.

THEOREM 9. Suppose $U, V$ is a G. C. F. of $R_1, \ldots, R_m$, and $L(U) \le L(V)$,
and Operation II cannot be performed on $U, V$. Then $U, V$ is a minimal G. C. F.
of $R_1, \ldots, R_m$.

Proof. Since $U, V$ is a G. C. F., $U \ne 0$. Hence $0 < L(U)$. Also, since
Operation II cannot be performed, $L(V) \le L(V - nU)$. In particular,
$L(V) \le L(V \pm U)$. As $L(W) = L(-W)$, $L(V) \le L(\pm V \pm U)$. So

(1)    $0 < [L(U)]^2 \le [L(V)]^2$,

(2)    $[L(V)]^2 \le [L(\pm V \pm U)]^2$.

Now put

    $f(x, y) = x^2\,U^T A U + 2xy\,U^T A V + y^2\,V^T A V$
        $= (xU + yV)^T A (xU + yV) = [L(xU + yV)]^2$.

Then (1) and (2) become

    $0 < f(1, 0) \le f(0, 1)$,
    $f(0, 1) \le f(e_1, e_2)$,

where the second condition holds for all possible determinations of the $e$'s. So
we have the hypothesis of Theorem 6 satisfied. Now any I. L. C. of the $R$'s
is an I. L. C. of $U, V$. So let $aU + bV$ be a shortest non-zero I. L. C. of the
$R$'s. Then $[L(aU + bV)]^2 = f(a, b)$. By Theorem 6, $f(a, b) \ge f(1, 0)$. As
$f(1, 0) = [L(U)]^2$, we have $L(aU + bV) \ge L(U)$. But $aU + bV$ is a shortest
non-zero I. L. C. of the $R$'s, and $U$ is a non-zero I. L. C. of the $R$'s. So $U$ is a
shortest non-zero I. L. C. of the $R$'s. Now let $cU + dV$ be a shortest I. L. C.
of the $R$'s which is independent of $U$. Then $d \ne 0$. So by Theorem 6,
$[L(cU + dV)]^2 = f(c, d) \ge f(0, 1) = [L(V)]^2$. So $V$ is a shortest I. L. C. of
the $R$'s which is independent of $U$. So $U, V$ is a minimal G. C. F. of the $R$'s.

Suppose $U, V, W$ is a G. C. F. of a set of vectors $R_1, \ldots, R_m$ and one
successively performs Operation III on $U, V, W$. By an argument similar to that
used above, there must come a time when Operation III can no longer be
performed. If we then arrange the three resultant vectors in order of increasing
length, they will constitute a minimal G. C. F. of the $R$'s, as the following
theorem shows.

THEOREM 10. Suppose $U, V, W$ is a G. C. F. of $R_1, \ldots, R_m$, and
$L(U) \le L(V) \le L(W)$, and Operation III cannot be performed on $U, V, W$. Then
$U, V, W$ is a minimal G. C. F. of $R_1, \ldots, R_m$.

The proof is similar to the proof of Theorem 9. If we put $f(x, y, z) =
[L(xU + yV + zW)]^2$, then $0 < L(U) \le L(V) \le L(W)$, and this together with
the inequalities derivable from the fact that Operation III cannot be performed
insure that the hypothesis of Theorem 7 is satisfied. The conclusion of Theorem
7 can be used as the conclusion of Theorem 6 was used in the proof of Theorem 9,
to show that $U, V, W$ is a minimal G. C. F.

If $U, V, W, X$ is a G. C. F. of $R_1, \ldots, R_m$, then one can find a minimal
G. C. F. of the $R$'s by use of Operation IV, as the following theorem shows.

THEOREM 11. Suppose $U, V, W, X$ is a G. C. F. of $R_1, \ldots, R_m$, and
$L(U) \le L(V) \le L(W) \le L(X)$, and Operation IV cannot be performed on
$U, V, W, X$. Then $U, V, W, X$ is a minimal G. C. F. of $R_1, \ldots, R_m$.

The proof is similar to the proof of Theorem 9.

We have now shown that if one has a G. C. F. in four or less dimensions, then
one can find a minimal G. C. F. by use of Operations II, III, and IV. The
question of how to find a G. C. F. in the first place is relatively unimportant,
for in all the applications that I have found for the algorithm so far, either one
has a G. C. F. given to start with, or else there is no G. C. F. However, it
could be shown that if one should start with three dependent commensurable
vectors, not all zero, and apply Operation III, one would eventually find a
G. C. F. Also, it could be shown that if one should start with four dependent
commensurable vectors, not all zero, and apply Operation IV, one would
eventually find a G. C. F. Whether one could find a G. C. F. for five dependent
commensurable vectors by use of an "Operation V" obtained by strict
generalization of Operations II, III, and IV, I do not know. Certainly nothing
would be lost by performing such an "Operation V" on the five vectors as long as
it could be performed, and saving the more complicated earlier methods (see
Theorem 5) as a last resort.

We now consider a number of applications. 

PROBLEM I. Let us ask for the minimum (different from zero, of course) of
the positive definite quadratic form

    $Q(a, b, c, d) = 7a^2 + 14b^2 + 326c^2 + 81d^2 + 18ab + 90ac$
        $+ 45ad + 134bc + 66bd + 324cd$.

If we put

    $A = \begin{pmatrix} 14 & 18 & 90 & 45 \\ 18 & 28 & 134 & 66 \\ 90 & 134 & 652 & 324 \\ 45 & 66 & 324 & 162 \end{pmatrix}$

and $V = (a, b, c, d)$, then $V^T A V = 2Q(a, b, c, d)$. So we wish to minimize
$V^T A V$. If we put

    $V_1 = (1, 0, 0, 0)$,
    $V_2 = (0, 1, 0, 0)$,
    $V_3 = (0, 0, 1, 0)$,
    $V_4 = (0, 0, 0, 1)$,

and base our definition of length on $A$, then we wish to find a shortest non-zero
I. L. C. of $V_1, V_2, V_3, V_4$. For this, it suffices to find a minimal G. C. F. of
$V_1, V_2, V_3, V_4$. As $V_1, V_2, V_3, V_4$ are independent, they constitute a
G. C. F. of themselves. Hence, if we apply Operation IV to $V_1, V_2, V_3, V_4$,
we will arrive at a minimal G. C. F.

In order to illustrate the ease with which the applications of Operation IV
can be performed, we have prepared a table summarizing the computations.
It is Table I at the end of the paper. The values of $V_i^T A V_j$ are useful in
performing Operation IV, so in Table I we have listed the $V$'s along the top and
left side, and have put the value of $V_i^T A V_j$ under $V_j$ and to the right of $V_i$.
For $1 \le i \le 4$ and $1 \le j \le 4$, the value of $V_i^T A V_j$ is just the component of $A$
in the $i$-th row and $j$-th column. Note that $V_1$ is the shortest of the four initial
vectors. So for Operation IVA, we seek to minimize one of $V_2, V_3, V_4$ by
integral use of $V_1$. Specifically, we pick on $V_2$. The $n$ which minimizes
$L(V_2 - nV_1)$ is just the $n$ which minimizes $|V_2^T A V_1 - n V_1^T A V_1|$, as we showed
earlier. By inspection, this $n$ is unity. So we define $V_5 = V_2 - V_1$, and
replace $V_2$ by $V_5$. The values of $V_5^T A V_j$ can be readily computed by noting
that

    $V_k^T A (V_i - nV_j) = V_k^T A V_i - n V_k^T A V_j$.

So the numbers under $V_1$, $V_3$, and $V_4$ and to the right of $V_5$ are obtained by
subtracting the numbers to the right of $V_1$ from the numbers to the right of $V_2$.
The values of $V_j^T A V_5$ can be filled in immediately from the relation
$V_j^T A V_5 = V_5^T A V_j$. Finally $V_5^T A V_5$ can be computed from the relation

    $(V_i - nV_j)^T A (V_i - nV_j) = V_i^T A V_i - 2n V_i^T A V_j + n^2 V_j^T A V_j$.

Note that we do not record $V_2^T A V_5$ or $V_5^T A V_2$ as we are discarding $V_2$ in favor
of $V_5$. Now $V_5$ is shorter than each of $V_1$, $V_3$, and $V_4$, so that our next three
uses of Operation IV are uses of Operation IVA, in which we successively
minimize $V_1$, $V_3$, and $V_4$ by integral use of $V_5$. Then for $V_9$, we minimize $V_7$
by integral use of $V_6$, which is an application of Operation IVB. For $V_{10}$,
we return to Operation IVA again. Our first use of Operation IVC comes in
finding $V_{13}$, after which we again return to Operation IVA. We observe that
$V_{15}$ gives $Q$ the value unity, which is clearly a minimum, so we stop, although
we do not yet have a minimal G. C. F. So $a = 1$, $b = 2$, $c = -1$, $d = 1$ makes
$Q = 1$, a minimum.
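The arithmetic of Problem I is easy to confirm by machine. The following sketch assumes the matrix $A$ whose rows are read off from the coefficients of $Q$ (diagonal entries twice the square coefficients, off-diagonal entries equal to the cross coefficients). It checks the relation $V^T A V = 2Q$, the first quotient, and the final value of $Q$. The names `Q`, `vAv`, and `first_n` are ours.

```python
from itertools import product

A = [[14, 18, 90, 45],
     [18, 28, 134, 66],
     [90, 134, 652, 324],
     [45, 66, 324, 162]]

def Q(a, b, c, d):
    return (7 * a * a + 14 * b * b + 326 * c * c + 81 * d * d
            + 18 * a * b + 90 * a * c + 45 * a * d
            + 134 * b * c + 66 * b * d + 324 * c * d)

def vAv(v):
    """The value V^T A V."""
    return sum(v[i] * A[i][j] * v[j] for i in range(4) for j in range(4))

# V^T A V = 2Q on a small grid of integer vectors.
relation_ok = all(vAv(v) == 2 * Q(*v) for v in product(range(-2, 3), repeat=4))

# First step: the n minimizing |V_2^T A V_1 - n V_1^T A V_1| = |18 - 14n|.
first_n = (2 * A[1][0] + A[0][0]) // (2 * A[0][0])

value = Q(1, 2, -1, 1)
print(relation_ok, first_n, value)
```

Since $Q$ takes only integer values and is positive definite, the value 1 cannot be improved upon, which is the reason given in the text for stopping.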

This minimum is not unique, because there are other sets of values of a, b, 
c, d which also give Q the value unity. After we have considered the problem 
of finding all minimal G. C. F.’s we will find all sets of a, b, c, d which make Q = 1. 

We now prepare the way for Problem II by a short discussion. 

A well-known theorem² states:

Given $n$ linear forms in $n$ variables with arbitrary real coefficients and a
determinant $\Delta \ne 0$, one can choose integer values of the variables, not all zero,
such that, simultaneously, each form is $\le |\Delta|^{1/n}$ in absolute value.

Naturally this raises the question of how to find the integers whose existence
is stated. In the book Minkowski Geometry of Numbers by Harris Hancock,
pp. 378-444 are devoted to a description and discussion of an algorithm for
finding the integers in the case $n = 3$. No information is given for the case
$n = 4$. Let us consider the four forms

    $L_1(x, y, z, w) = 831x + 1242y + 913z + 502w$,
    $L_2(x, y, z, w) = 799x + 1401y + 288z + 311w$,
    $L_3(x, y, z, w) = 992x + 411y + 776z + 168w$,
    $L_4(x, y, z, w) = 321x + 889y + 1706z + 842w$.

For these forms, $\Delta = 255$. So $\Delta^{1/4} < 4$. So there must be integer values of
$x$, $y$, $z$, and $w$ which make $|L_i| \le 3$ simultaneously.

PROBLEM II. Find integer values of $x$, $y$, $z$, $w$, not all zero, such that
$|L_i| \le 3$ simultaneously.


² H. Minkowski, Geometrie der Zahlen, p. 104.







Putting

    $V_1 = (831, 799, 992, 321)$,
    $V_2 = (1242, 1401, 411, 889)$,
    $V_3 = (913, 288, 776, 1706)$,
    $V_4 = (502, 311, 168, 842)$,

we see that Problem II is equivalent to the problem of finding integers $x$, $y$, $z$, $w$
such that the components of $xV_1 + yV_2 + zV_3 + wV_4$ are all $\le 3$ in absolute
value. Clearly this vector $xV_1 + yV_2 + zV_3 + wV_4$ has length (in the usual
sense) $\le 6$. So if we can find all I. L. C.'s of $V_1, V_2, V_3$, and $V_4$ which have
length $\le 6$, one of them must be the vector $xV_1 + yV_2 + zV_3 + wV_4$.
Presumably this problem could be solved by use of Construction 1 (with $U = 0$).
However, to apply Construction 1 to $V_1, V_2, V_3, V_4$ directly would be
impracticable because there would be an enormous number of cases to try, each
involving a most unpleasant amount of computation. Let us instead first find
a minimal G. C. F. of $V_1, V_2, V_3, V_4$. This can be done rather quickly and
we get the following for a minimal G. C. F.

    $U_1 = (0, 1, 2, 1)$,
    $U_2 = (-1, 3, -2, 0)$,
    $U_3 = (-3, -1, 1, -2)$,
    $U_4 = (-4, -4, -1, 5)$.

Already we have three I. L. C.'s of the $V$'s with all their components $\le 3$ in
absolute value, namely $U_1, U_2, U_3$. A slight bit of trial and error discloses
two more, namely

    $U_3 + U_1 = (-3, 0, 3, -1)$,
    $U_3 - U_1 = (-3, -2, -1, -3)$.

These are probably all, but one could be sure by applying Construction 1 to
find all I. L. C.'s of the $U$'s with length $\le 6$, which could be done rather quickly.
Of course we still have to exhibit an actual set of values of $x$, $y$, $z$, $w$. So let us
find the $x$, $y$, $z$, $w$ such that $xV_1 + yV_2 + zV_3 + wV_4 = U_1$. This merely
amounts to solving the four equations $L_1 = 0$, $L_2 = 1$, $L_3 = 2$, $L_4 = 1$
simultaneously. We record the answer, together with the four other answers got by
putting $xV_1 + yV_2 + zV_3 + wV_4$ equal respectively to $U_2$, $U_3$, $U_3 + U_1$,
and $U_3 - U_1$ in Table II at the end of the paper.
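A short consistency check on the vectors of Problem II is sketched below. It assumes the reading $U_4 = (-4, -4, -1, 5)$, that is, with third component $-1$; this is the reading for which the determinant of the $U$'s has absolute value $\Delta = 255$, as it must if the $U$'s and the $V$'s are G. C. F.'s of one another. The helper `det` is ours.

```python
def det(m):
    """Determinant of a small integer matrix by expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

U1 = (0, 1, 2, 1)
U2 = (-1, 3, -2, 0)
U3 = (-3, -1, 1, -2)
U4 = (-4, -4, -1, 5)      # third component read as -1 (see the lead-in)

d = det([U1, U2, U3, U4])
plus = tuple(a + b for a, b in zip(U3, U1))
minus = tuple(a - b for a, b in zip(U3, U1))

# The five combinations named in the text, with their largest component.
small = [U1, U2, U3, plus, minus]
largest = [max(abs(c) for c in v) for v in small]
print(d, plus, minus, largest)
```

All five vectors have every component at most 3 in absolute value, in agreement with the text.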

We prepare the way for Problem III with certain explanatory remarks. 

A well-known theorem³ states:


³ G. H. Hardy and E. M. Wright, An Introduction to the Theory of Numbers, p. 169.










If $x_1, \ldots, x_n$ are real numbers, not all rational, then the system of
simultaneous inequalities on the $a$'s,

(A)    $|a x_i - a_i| < a^{-1/n}$    $(i = 1, 2, \ldots, n)$

has an infinity of solutions in integers.

However, there is no well-known technique for finding such solutions.⁴ We
shall propose a technique, but we are unable to prove that it always works.
However, it has never yet failed when applied to a particular numerical problem,
and we shall cite one severe test which it survived. To exhibit the capabilities
of the technique, we shall solve a slightly more difficult problem than merely
finding some random solutions of the inequalities (A).

PROBLEM III. Prove, by exhibiting the necessary $a$'s, $b$'s, and $c$'s, that for
each $z$ between 1 and 1,000,000 inclusive there are positive integers $a$, $b$, and $c$
such that $a \le z$ and

    $|a\pi - b| < z^{-1/2}$,
    $|a\gamma - c| < z^{-1/2}$.

Note that since $a \le z$, the $a$, $b$, and $c$ satisfy (A) with $x_1 = \pi$ and $x_2 = \gamma$.
In order to solve this problem it is necessary to find, for various values of $\epsilon$,
values of $a$, $b$, and $c$ such that

    $|a\pi - b| < \epsilon$,
    $|a\gamma - c| < \epsilon$.

If we put $V_1 = (\pi, \gamma)$, $V_2 = (-1, 0)$, and $V_3 = (0, -1)$, then the coordinates of
$aV_1 + bV_2 + cV_3$ are $a\pi - b$ and $a\gamma - c$. So we wish to find short I. L. C.'s of
$V_1, V_2, V_3$. To this end we apply Operation III repeatedly. By making a
judicious selection from the shorter and shorter I. L. C.'s of $V_1, V_2, V_3$ which
we obtain by use of Operation III, we were able to make up Table III at the end
of the paper, which solves Problem III by exhibiting sets of values of $a$, $b$, $c$,
each of which satisfies the desired inequalities for a given range of $z$, and such
that the $z$-ranges overlap and cover the interval from 1 to 1,000,000.

As a test of the technique, I undertook to find solutions of (A) when $x_1 =
\log 3/\log 2$, $x_2 = \log 5/\log 2$, $x_3 = \log 7/\log 2$, $x_4 = \log 17/\log 2$. Here one
would put

    $V_1 = (x_1, x_2, x_3, x_4)$,
    $V_2 = (-1, 0, 0, 0)$,
    $V_3 = (0, -1, 0, 0)$,
    $V_4 = (0, 0, -1, 0)$,
    $V_5 = (0, 0, 0, -1)$


⁴ When $n = 1$, the solutions can be found by continued fractions. However Jacobi's
generalized continued fraction algorithm fails to furnish solutions for $n > 1$ except for
special cases. See O. Perron, Grundlagen für eine Theorie des Jacobischen
Kettenbruchalgorithmus, Math. Ann., vol. 64 (1907), pp. 1-76, especially pp. 20-23.










and search for short I. L. C.'s of the $V$'s. To carry out the search, one would
apply a generalization of Operations II, III, and IV. This generalized operation
was very successful in finding solutions of (A), and continued to be successful
as far as the computation was carried. In order to make the test a severe one,
the computation was carried quite a way, so far in fact, that it was necessary to
use fifteen decimal place approximations for the logarithms toward the end of
the computation. A sample solution of (A) (this was obtained near the end of
the computation) is $a = 60{,}243{,}195$, $a_1 = 95{,}483{,}205$, $a_2 = 139{,}880{,}367$,
$a_3 = 169{,}124{,}030$, $a_4 = 246{,}241{,}821$. Note that this is not merely a solution
of (A) but satisfies the stronger inequalities


| ax; — a; | ei"? = 1, 2, 3, 4). 
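The sample solution quoted above can be checked against the inequalities (A) with ordinary double-precision arithmetic, which is accurate enough here because the products $a x_i$ are of size about $10^8$ and the bound $a^{-1/4}$ is about $0.011$. The sketch below checks only (A) with $n = 4$; the variable names are ours.

```python
import math

a = 60_243_195
numerators = [95_483_205, 139_880_367, 169_124_030, 246_241_821]   # a_1 .. a_4
xs = [math.log2(p) for p in (3, 5, 7, 17)]                         # x_1 .. x_4

errors = [abs(a * x - ai) for x, ai in zip(xs, numerators)]
bound = a ** (-1 / 4)                                              # a^(-1/n), n = 4

for x, ai, e in zip(xs, numerators, errors):
    print(f"x = {x:.15f}  a_i = {ai}  |a*x - a_i| = {e:.6f}")
print("bound:", bound)
```

Each of the four differences comes out well under the bound, so the quoted integers do satisfy (A).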
We postpone further applications until after more development of the theory. 


THEOREM 12. If $U_1, V_1$ is a minimal G. C. F. of $U, V$, then $U_1$ is at least as
short as the shorter of $U$ and $V$, and $V_1$ is at least as short as the longer of $U$ and $V$.

Proof. By definition, $U_1$ is a shortest non-zero I. L. C. of $U$ and $V$. Now $U$
and $V$ are independent, since otherwise their G. C. F. would consist of a single
vector. So both $U$ and $V$ are non-zero I. L. C.'s of $U$ and $V$. Hence $U_1$ must
be at least as short as the shorter of $U$ and $V$. Since $U$ and $V$ are independent,
they are not both dependent on $U_1$. Hence at least one of them is an I. L. C.
of $U$ and $V$ which is independent of $U_1$, and so must be at least as long as $V_1$,
by the definition of a minimal G. C. F. So $V_1$ must be at least as short as the
longer one of $U$ and $V$.

THEOREM 13. If $U_1, V_1, W_1$ is a minimal G. C. F. of $U, V, W$, then $U_1$ is at
least as short as the shortest of $U, V, W$, and $V_1$ is at least as short as the second
shortest of $U, V, W$, and $W_1$ is at least as short as the longest of $U$, $V$, and $W$.

The proof is similar to that of Theorem 12.

THEOREM 14. If $U_1, V_1, W_1, X_1$ is a minimal G. C. F. of $U, V, W, X$ and
$L(U) \le L(V) \le L(W) \le L(X)$, then $L(U_1) \le L(U)$, $L(V_1) \le L(V)$,
$L(W_1) \le L(W)$, and $L(X_1) \le L(X)$.

The proof is similar.

THEOREM 15. If $U_1, V_1$ and $U_2, V_2$ are both minimal G. C. F.'s of $R_1, \ldots,
R_m$, then $L(U_1) = L(U_2)$ and $L(V_1) = L(V_2)$.

Proof. Since $U_1, V_1$ and $U_2, V_2$ are both minimal G. C. F.'s of $R_1, \ldots, R_m$,
it follows that $U_1, V_1$ is a minimal G. C. F. of $U_2, V_2$. So by Theorem 12,
$L(U_1) \le L(U_2)$ and $L(V_1) \le L(V_2)$. Interchanging $U_1, V_1$ with $U_2, V_2$, we
get $L(U_2) \le L(U_1)$ and $L(V_2) \le L(V_1)$.

THEOREM 16. If $U_1, V_1, W_1$ and $U_2, V_2, W_2$ are both minimal G. C. F.'s
of $R_1, \ldots, R_m$, then $L(U_1) = L(U_2)$, $L(V_1) = L(V_2)$, and $L(W_1) = L(W_2)$.

The proof is similar.








THEOREM 17. If $U_1, V_1, W_1, X_1$ and $U_2, V_2, W_2, X_2$ are both minimal
G. C. F.'s of $R_1, \ldots, R_m$, then $L(U_1) = L(U_2)$, $L(V_1) = L(V_2)$, $L(W_1) = L(W_2)$,
and $L(X_1) = L(X_2)$.

The proof is similar.

We now make a few remarks preliminary to the statement of Problem IV. 

Put $L(x, y, z, w) = 985x + 408y - 780z - 571w$. A general solution in
integers of $L = k_1$ is⁵

    $x = 5k_1 + 8k_2 + 3k_3 + 9k_4$,
    $y = 5k_1 + 11k_2 + 10k_3 + 5k_4$,
    $z = 6k_1 + 10k_2 + 20k_3 + 3k_4$,
    $w = 4k_1 + 8k_2 - 15k_3 + 15k_4$,

where $k_1$ is the same as in the equation $L = k_1$, and $k_2$, $k_3$, $k_4$ are arbitrary.
If we put

    $V = (x, y, z, w)$,
    $V_1 = (5, 5, 6, 4)$,
    $V_2 = (8, 11, 10, 8)$,
    $V_3 = (3, 10, 20, -15)$,
    $V_4 = (9, 5, 3, 15)$,

then our general solution takes the form

    $V = \sum_{i=1}^{4} k_i V_i$.

PROBLEM IV. Find a general solution in integers

    $x = a_{11}k_1 + a_{12}k_2 + a_{13}k_3 + a_{14}k_4$,
    $y = a_{21}k_1 + a_{22}k_2 + a_{23}k_3 + a_{24}k_4$,
    $z = a_{31}k_1 + a_{32}k_2 + a_{33}k_3 + a_{34}k_4$,
    $w = a_{41}k_1 + a_{42}k_2 + a_{43}k_3 + a_{44}k_4$

of $L = k_1$ for which $\sum (a_{ij})^2$ shall be a minimum.

An equivalent statement of the problem is clearly the following.


⁵ Jacobi is usually credited with being the first to give a systematic method for finding
such solutions. However in his treatment of the question (Werke, vol. 6, pp. 355-384) he
contrasts certain of his results with results of Euler. The solution given here is obtained
by the method of Rosser, A note on the linear Diophantine equation, Amer. Math. Monthly,
vol. 48 (1941), pp. 662-666.










PROBLEM IV. Find a general solution in integers

    $V = \sum_{i=1}^{4} k_i U_i$

(where the $U$'s have integer components) of $L = k_1$ for which
$[L(U_1)]^2 + [L(U_2)]^2 + [L(U_3)]^2 + [L(U_4)]^2$ is a minimum.

We first make a few remarks about any general solution

    $V = \sum_{i=1}^{4} k_i W_i$

of $L = k_1$ (where the $W$'s have integer components). Putting $k_1 = 1$, $k_2 =
k_3 = k_4 = 0$, we see that $W_1$ is a solution of $L = 1$. Also putting $k_1 = 0$, we
see that

    $V = \sum_{i=2}^{4} k_i W_i$

is a general solution in integers of $L = 0$. In other words, every lattice point
on the hyperplane $L = 0$ is an I. L. C. of $W_2, W_3, W_4$ and conversely. Note
that $(408, -985, 0, 0)$, $(780, 0, 985, 0)$, and $(571, 0, 0, 985)$ are three obvious
independent lattice points on $L = 0$. Hence $W_2$, $W_3$, and $W_4$ must be
independent. Let $X_2, X_3, X_4$ be any G. C. F. of $W_2, W_3, W_4$. Then any I. L. C.
of $W_2, W_3, W_4$ is an I. L. C. of $X_2, X_3, X_4$ and conversely, so that any lattice
point on $L = 0$ is an I. L. C. of $X_2, X_3, X_4$ and conversely. So

    $V = \sum_{i=2}^{4} k_i X_i$

is a general solution of $L = 0$. Now let $X_1$ be any solution of $L = 1$. Then

    $V = \sum_{i=1}^{4} k_i X_i$

is a general solution of $L = k_1$, which we show as follows. Let $V$ be any solution
of $L = k_1$. Then $V - k_1 X_1$ is a solution of $L = 0$. So

    $V - k_1 X_1 = \sum_{i=2}^{4} k_i X_i$.

By similar arguments, one can show conversely that if

    $V = \sum_{i=1}^{4} k_i X_i$

is a general solution, then $X_1$ is a solution of $L = 1$ and $X_2, X_3, X_4$ is a G. C. F.
of $W_2, W_3, W_4$.
Now choose $W_2$, $W_3$, $W_4$ a minimal G. C. F. of $V_2$, $V_3$, $V_4$, namely

\[
\begin{aligned}
W_2 &= (5, -2, 6, -1), \\
W_3 &= (1, -6, -7, 7), \\
W_4 &= (3, 13, 4, 9).
\end{aligned}
\]













GENERALIZATION OF EUCLIDEAN ALGORITHM 85 
Also take $W_1$ a shortest vector of the form $V_1 - m_2V_2 - m_3V_3 - m_4V_4$ (see
Construction 2), namely

\[
W_1 = (0, 7, 0, 5).
\]

By our method of choosing $W_1$, it is clearly a shortest solution of $L = 1$. By
the previous discussion

\[
V = \sum_{i=1}^{4} k_i W_i
\]

is a general solution of $L = k_1$. We now undertake to show that it is the one
which we are seeking. Let

\[
V = \sum_{i=1}^{4} k_i U_i
\]

be a solution with $\sum [L(U_i)]^2$ a minimum. By our previous discussion, $W_2$, $W_3$,
$W_4$ is a G. C. F. of $U_2$, $U_3$, $U_4$, and hence is a minimal G. C. F. So by Theorem
13, $[L(W_2)]^2 + [L(W_3)]^2 + [L(W_4)]^2 \le [L(U_2)]^2 + [L(U_3)]^2 + [L(U_4)]^2$. Also
$U_1$ is a solution of $L = 1$. As $W_1$ is a shortest such solution, $[L(W_1)]^2 \le [L(U_1)]^2$.
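To make the gain concrete, the sum of the squares of the coefficients can be compared for the two general solutions. This is a small Python sketch under the assumption that length is the usual Euclidean one; the function names are illustrative only.

```python
COEFFS = (985, 408, -780, -571)
V = [(5, 5, 6, 4), (8, 11, 10, 8), (3, 10, 20, -15), (9, 5, 3, 15)]
W = [(0, 7, 0, 5), (5, -2, 6, -1), (1, -6, -7, 7), (3, 13, 4, 9)]

def L(p):
    """Value of the linear form at the integer point p."""
    return sum(c * t for c, t in zip(COEFFS, p))

def sum_of_squares(vectors):
    """Sum of the squares of all coefficients a_ij of a general solution."""
    return sum(t * t for v in vectors for t in v)

w_values = [L(w) for w in W]   # W1 must solve L = 1, the others L = 0
cost_V = sum_of_squares(V)     # original solution
cost_W = sum_of_squares(W)     # solution built from W1, ..., W4
```

With these vectors the sum drops from 1525 for the $V$'s to 550 for the $W$'s.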

We pause to remark on the role of Construction 2 in the discovery of $W_1$.
As indicated in the discussion following Construction 2, the shorter the vector
one starts with, the less labor is involved in finding a minimum. So before
applying Construction 2, we minimized $V_1$ by integral use of $W_2$. This gave
$W_1$. Then we attempted to minimize $W_1$ by integral use of $W_3$ or $W_4$ or by
unit use of combinations of $W_2$, $W_3$, $W_4$. When these attempts failed, then
it was time to use Construction 2.
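The first reduction step can be sketched in Python. The sketch assumes that "integral use" of a vector $W$ means subtracting the integer multiple $nW$ that minimizes the Euclidean length, that is, $n$ is the nearest integer to $(U \cdot W)/(W \cdot W)$; the function names are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def nearest_multiple(u, v):
    """Integer n minimising the Euclidean length of u - n*v (exact arithmetic)."""
    num, den = dot(u, v), dot(v, v)
    return (2 * num + den) // (2 * den)   # nearest integer to num/den

def reduce_by(u, v):
    """Shortest vector of the form u - n*v with n an integer."""
    n = nearest_multiple(u, v)
    return tuple(a - n * b for a, b in zip(u, v))

V1 = (5, 5, 6, 4)
W2, W3, W4 = (5, -2, 6, -1), (1, -6, -7, 7), (3, 13, 4, 9)

W1 = reduce_by(V1, W2)                                  # one step with W2
unchanged = [reduce_by(W1, w) == W1 for w in (W3, W4)]  # further steps fail
```

Under this reading, reducing $V_1$ by $W_2$ yields $(0, 7, 0, 5)$, and neither $W_3$ nor $W_4$ shortens it further, in agreement with the remark above.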

PROBLEM V. Let $L(x, y, z) = x \log 2 + y \log 3 + z \log 5$. Find the least
positive value which $L(a, b, c)$ can assume subject to the restriction $a^2 + b^2 + c^2
< 10000$, and find the $a$, $b$, and $c$ which produce this value.

Clearly an alternative way of stating the problem is the following.

PROBLEM V. Find that lattice point distinct from the origin and lying within
the sphere $x^2 + y^2 + z^2 = 10000$ which is closest to the plane $L(x, y, z) = 0$.

We proceed as follows. Take a linear form with integer coefficients, $K(x, y, z)$,
such that the planes $L = 0$ and $K = 0$ nearly coincide. Then if there are
lattice points on $K = 0$ within the sphere $x^2 + y^2 + z^2 = 10000$ (we shall henceforth
refer to this sphere as $\sigma$), one of them is likely to be the point for which we
are looking. So one restriction on the coefficients of $K$ is that they should be
small enough so that there are likely to be lattice points (besides the origin)
on $K = 0$ and inside $\sigma$. This means, roughly speaking, that there should be
lattice points on $K = 0$ whose components are numerically $< 100$. So the
coefficients of $K$ should be numerically $< 10000$. For, let the coefficients of $K$
each be numerically $< 10000$. Consider the three linear forms:

\[
100K, \quad y, \quad z.
\]

Their determinant, $\Delta$, is numerically $< 1{,}000{,}000$, and so (by the theorem
quoted just before Problem II) there are integer values of the variables, not
all zero, which make all three forms simultaneously $\le |\Delta|^{1/3} < 100$ in absolute
value. As these integer values make $K$ an integer, and as they make $|100K| <
100$, they must make $K = 0$. So there are integer values of $x$, $y$, and $z$, with
$|y| < 100$, $|z| < 100$, which make $K = 0$. It does not follow, of course, that
$|x| < 100$, but from the fact that $K = 0$ and $|y| < 100$ and $|z| < 100$, it
follows that $|x|$ cannot be too large. So we wish a $K$ with coefficients numerically
$< 10000$ such that $K = 0$ and $L = 0$ shall coincide as nearly as possible.
For this purpose we wish three integers $k_1$, $k_2$, $k_3$, all $< 10000$, which are nearly
in the ratio $\log 2$, $\log 3$, $\log 5$. Hence

\[
k_3 \log 2 - k_1 \log 5, \qquad k_3 \log 3 - k_2 \log 5
\]

must be small. As these are the components of $k_1V_1 + k_2V_2 + k_3V_3$, where

\[
\begin{aligned}
V_1 &= (-\log 5,\ 0), \\
V_2 &= (0,\ -\log 5), \\
V_3 &= (\log 2,\ \log 3),
\end{aligned}
\]

we search for short I. L. C.'s of $V_1$, $V_2$, $V_3$. This search can be carried out by
applying Operation III to the three vectors (see Problem III). Using the values
of $k_1$, $k_2$, $k_3$ which we get by use of Operation III, we take $K$ to be

\[
4296x + 6809y + 9975z.
\]
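How nearly the chosen coefficients are in the ratio $\log 2 : \log 3 : \log 5$ can be checked numerically. The Python sketch below assumes natural logarithms (the ratios are the same for any base); the variable names are illustrative.

```python
import math

k1, k2, k3 = 4296, 6809, 9975   # coefficients of K from the text

# The two components of k1*V1 + k2*V2 + k3*V3, which should be small.
residuals = (
    k3 * math.log(2) - k1 * math.log(5),
    k3 * math.log(3) - k2 * math.log(5),
)

# The three quotients k_i / log(p_i); nearly equal when the ratios match.
quotients = (k1 / math.log(2), k2 / math.log(3), k3 / math.log(5))
spread = max(quotients) - min(quotients)
```

Both residuals come out a few thousandths in absolute value, although the integers involved are of the order of $10^4$.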
Put $\alpha = ((4296)^2 + (6809)^2 + (9975)^2)^{1/2}$. We find that the angle, $\theta$, between
$K = 0$ and $L = 0$ is approximately $(0.00273/\alpha)$ radians. So at the surface of
$\sigma$, the maximum separation between $K = 0$ and $L = 0$ is approximately $100\theta$,
or $0.273/\alpha$. As the distance between the planes $K = 0$ and $K = 1$ is $1/\alpha$, we
see that any point on $K = 0$ in $\sigma$ must be closer to $L = 0$ than any point on
$K = 1$ in $\sigma$. Likewise any point on $K = 1$ in $\sigma$ must be closer to $L = 0$ than
any point on $K = 2$ in $\sigma$. And so on. Note that any lattice point in space is
on some one of the planes $K = m$ for an appropriately chosen $m$. For if $(a, b, c)$
is the lattice point, choose $m = K(a, b, c)$. From all the above, we deduce the
following procedure for solving our problem. First, find all lattice points on
$K = 0$ in $\sigma$. If there are any (besides the origin), the closest to $L = 0$ is our
answer. If there are no lattice points (except the origin) on $K = 0$ in $\sigma$, find
all lattice points on $K = 1$ in $\sigma$. If there are any, the closest to $L = 0$ is our
answer. If there are no lattice points on $K = 1$ in $\sigma$, find all lattice points on
$K = 2$ in $\sigma$. And so on.

As a preliminary, we find a general solution in integers of $K = k_1$ for which the
sum of the squares of the coefficients is a minimum (see Problem IV), namely

\[
V = \sum_{i=1}^{3} k_i V_i ,
\]
















where

\[
\begin{aligned}
V_1 &= (54, -37, 2), \\
V_2 &= (90, 15, -49), \\
V_3 &= (71, -99, 37).
\end{aligned}
\]

By the method which we used to find this solution, $V_2$ is a shortest non-zero
vector with integer coefficients lying on $K = 0$. As the length of $V_2$ is $> 100$, there is no
lattice point (except the origin) on $K = 0$ in $\sigma$. However $V_1$ is on $K = 1$ and is
in $\sigma$. If we find all vectors of the form $V_1 - m_2V_2 - m_3V_3$ with length $< 100$
(see Construction 1), we will have all lattice points on $K = 1$ in $\sigma$. The set of
all lattice points on $K = 1$ and in $\sigma$ consists of


(54, —37, 2), 
(17, —62, 35), 
(36, 52, —51). 


Of these, the first is closest to L = 0, and we have 
54 log 2 — 37 log 3 + 2 log 5 = 0.000169 


as the answer to our problem. If one is willing to increase the size of $\sigma$ slightly
in order to get a point much closer to $L = 0$, one would have


— 90 log 2 — 15 log 3 + 49 log 5 = 0.000027. 
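The two numerical values quoted above can be reproduced with a few lines of Python. The sketch assumes that "log" means the natural logarithm, which is the reading that matches the printed decimals; the names `K` and `L` below are illustrative helpers.

```python
import math

K_COEFFS = (4296, 6809, 9975)
LOGS = (math.log(2), math.log(3), math.log(5))   # natural logarithms assumed

def K(p):
    """Integer form K(x, y, z) = 4296x + 6809y + 9975z."""
    return sum(c * t for c, t in zip(K_COEFFS, p))

def L(p):
    """Form L(x, y, z) = x log 2 + y log 3 + z log 5."""
    return sum(c * t for c, t in zip(LOGS, p))

V1, V2, V3 = (54, -37, 2), (90, 15, -49), (71, -99, 37)
neg_V2 = tuple(-t for t in V2)

k_values = [K(V1), K(V2), K(V3)]       # planes on which V1, V2, V3 lie
answer = L(V1)                         # 54 log 2 - 37 log 3 + 2 log 5
wider = L(neg_V2)                      # -90 log 2 - 15 log 3 + 49 log 5
norm_sq_V2 = sum(t * t for t in V2)    # exceeds 10000, so V2 lies outside the sphere
```

The point $-V_2$ has squared length 10726, which is why the sphere has to be enlarged slightly to admit it.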


THEOREM 18. Let $U_1$, $V_1$ and $U_2$, $V_2$ be minimal G. C. F.'s of $R_1, \dots, R_m$.
Then $U_2$ and $V_2$ must be two of $\pm U_1$, $\pm V_1$, $\pm U_1 \pm V_1$. Also if $L(U_1) <
L(V_1)$, then $U_2$ is one of $\pm U_1$.

Proof. Put $f(x, y) = [L(xU_1 + yV_1)]^2$. Since $U_1$, $V_1$ is minimal, we cannot
perform Operation II on $U_1$, $V_1$, because if we could, we would get a $U_3$, $V_3$
with $L(U_3) + L(V_3) < L(U_1) + L(V_1)$, contradicting Theorem 12. Since we
cannot perform Operation II on $U_1$, $V_1$, $f(x, y)$ satisfies the hypothesis of
Theorem 6 (see the argument in the beginning of the proof of Theorem 9).
Now $U_2$ is an I. L. C. of $U_1$ and $V_1$, say $U_2 = aU_1 + bV_1$. Then $f(1, 0) =
[L(U_1)]^2 = [L(U_2)]^2 = [L(aU_1 + bV_1)]^2 = f(a, b)$. Note that the step $[L(U_1)]^2 =
[L(U_2)]^2$ is based on Theorem 15. If $L(U_1) < L(V_1)$, then $f(1, 0) < f(0, 1)$.
By Theorem 6, $f(a, b) \ge f(0, 1)$ if $b \ne 0$, so if $L(U_1) < L(V_1)$, $b = 0$ and so
$a = \pm 1$. If $L(U_1) = L(V_1)$, then $f(1, 0) = f(0, 1)$, so that we can have $b \ne 0$.
However, we wish to show that in any case $|a| \le 1$ and $|b| \le 1$. Suppose
$|a| > 1$ and $|b| > 1$. Then the $\alpha$ in Theorem 6 is $\ge 2$, contradicting $f(a, b) =
f(0, 1)$. If either of $|a|$ or $|b|$ is zero and the other $> 1$, then $\alpha \ge 2$ again.
If one of $|a|$ and $|b|$ is $> 1$ and the other $= 1$, then $|a| \ne |b|$ and neither is
zero and $f(a, b) > f(0, 1)$ by Theorem 6. So $|a| \le 1$ and $|b| \le 1$. So $U_2$
must be one of the eight vectors mentioned. If we put $V_2 = cU_1 + dV_1$ and
apply a similar argument, we get $|c| \le 1$ and $|d| \le 1$.










THEOREM 19. Let $U_1$, $V_1$, $W_1$ and $U_2$, $V_2$, $W_2$ be minimal G. C. F.'s of
$R_1, \dots, R_m$. Then $U_2$, $V_2$, and $W_2$ must be I. L. C.'s of $U_1$, $V_1$, $W_1$ of the
form $aU_1 + bV_1 + cW_1$, where $|a| \le 1$, $|b| \le 1$, and $|c| \le 1$. If $L(U_1) <
L(V_1)$, then $U_2$ is one of $\pm U_1$. If $L(V_1) < L(W_1)$, then the allowable combinations
for $U_2$ and $V_2$ cannot contain $W_1$.

The proof is similar to that of Theorem 18. One point deserves mention.
We start off by putting $f(x, y, z) = [L(xU_1 + yV_1 + zW_1)]^2$. We put $U_2 =
aU_1 + bV_1 + cW_1$, so that $f(a, b, c) = f(1, 0, 0)$. If one of $a$, $b$, or $c$ is zero, we
proceed as follows. Suppose $a = 0$. Then think of $f(0, y, z)$ as a quadratic
form $g(y, z)$ and note that $g(y, z)$ satisfies the hypothesis of Theorem 6. Then
the argument of Theorem 18 will show that $|b| \le 1$ and $|c| \le 1$. The cases
where none of $a$, $b$, or $c$ is zero make use of Theorem 7.

THEOREM 20. Let $U_1$, $V_1$, $W_1$, $X_1$ and $U_2$, $V_2$, $W_2$, $X_2$ be minimal G. C. F.'s
of $R_1, \dots, R_m$. Then $U_2$, $V_2$, $W_2$, and $X_2$ must be I. L. C.'s of $U_1$, $V_1$, $W_1$, $X_1$
of the form $aU_1 + bV_1 + cW_1 + dX_1$, where either all four of $a$, $b$, $c$, $d$ have absolute
value $\le 1$ or else exactly one of $a$, $b$, $c$, and $d$ has absolute value 2 and the rest have
absolute value 1. If $L(U_1) < L(V_1)$, then $U_2$ is one of $\pm U_1$. If $L(V_1) < L(W_1)$,
then the allowable combinations for $U_2$ and $V_2$ cannot contain $W_1$ or $X_1$. If
$L(W_1) < L(X_1)$, then the allowable combinations for $U_2$, $V_2$, and $W_2$ cannot
contain $X_1$.

Proof. The proof proceeds just like the proof of Theorem 19 up to the point
where none of $a$, $b$, $c$, or $d$ is zero and $f(a, b, c, d) = f(0, 0, 0, 1)$. If all of $|a|$,
$|b|$, $|c|$, $|d|$ are $\ge 2$, then in Theorem 8, 6 = 3, and we have a contradiction.
So let one of them be 1. Now if there are three different values among $|a|$,
$|b|$, $|c|$, $|d|$, we have a contradiction again. Also if the two largest of
$|a|$, $|b|$, $|c|$, $|d|$ are equal and $\ge 2$, we can use the lemma of Theorem 8 to
obtain a contradiction. So we are reduced to the case where three of $|a|$, $|b|$,
$|c|$, $|d|$ equal 1 and the remaining one is $\ge 1$. We wish to show that the remaining
one is $\le 2$. The situation which we have corresponds to the situation
that would result in Case 1 of Theorem 8 if we put $y = z = w = 1$. By going
through the argument for this case, we see that $f(x, 1, 1, 1) > f(0, 0, 0, 1)$ if $x > 2$.
To see a case where one of $a$, $b$, $c$, or $d$ would have to be 2, take the usual
definition of length and let the $V$'s be

\[
\begin{aligned}
&(2, 0, 0, 0), \\
&(0, 2, 0, 0), \\
&(0, 0, 2, 0), \\
&(0, 0, 0, 2), \\
&(1, 1, 1, 1).
\end{aligned}
\]

Take the first three and the last to be $U_1$, $V_1$, $W_1$, $X_1$ and the last four to be
$U_2$, $V_2$, $W_2$, $X_2$.
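The coefficient 2 in this example can be exhibited directly. The short Python sketch below (variable names are illustrative) writes $(0, 0, 0, 2)$, which belongs to the second G. C. F., in terms of the first one; because $U_1$, $V_1$, $W_1$, $X_1$ are linearly independent, this representation is the only one.

```python
U1, V1, W1, X1 = (2, 0, 0, 0), (0, 2, 0, 0), (0, 0, 2, 0), (1, 1, 1, 1)
target = (0, 0, 0, 2)        # member of the second G.C.F.
coeffs = (-1, -1, -1, 2)     # a, b, c, d in a*U1 + b*V1 + c*W1 + d*X1

combo = tuple(
    sum(c * v[j] for c, v in zip(coeffs, (U1, V1, W1, X1)))
    for j in range(4)
)

# All five vectors of the example have the same squared length under the usual metric.
squared_lengths = [sum(t * t for t in v) for v in (U1, V1, W1, X1, target)]
```

Three coefficients have absolute value 1 and one has absolute value 2, which is the exceptional case allowed by Theorem 20.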







As an example of a case where there is a large number of minimal G. C. F.'s,
take the usual definition of distance, and let

\[
(1, 0, 0), \qquad
\left(\frac{1}{2}, \frac{\sqrt{3}}{2}, 0\right), \qquad
\left(\frac{1}{2}, \frac{1}{2\sqrt{3}}, \frac{2}{\sqrt{6}}\right)
\]

be a minimal G. C. F. of a set of $R$'s. (The $R$'s could consist of the minimal
G. C. F. itself, of course.) Then if $U_2$, $V_2$, $W_2$ is another minimal G. C. F.,
$\pm U_2$ could be any one of


(1, 0, 0), 


$\pm V_2$ could be any one of the remaining five, and $\pm W_2$ could be any one of three
or four others. By trial, we find that sixteen distinct combinations of three out
of the above six vectors will serve as a G. C. F. As $+$ or $-$ signs may be
attached to three vectors in eight ways, we get a total of $8 \times 16$ G. C. F.'s.
As these vectors are all the same length, they may be taken in any order to
produce a minimal G. C. F., and as the order is a distinguishing property of
minimal G. C. F.'s, we find that there are a total of $6 \times 8 \times 16$ or 768 possible
minimal G. C. F.'s.
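The count of sixteen can be reproduced by a brute-force search. The Python sketch below works in integer coordinates relative to the three given unit vectors, whose mutual inner products are all $\tfrac{1}{2}$, so that the squared length of $xA + yB + zC$ is $x^2 + y^2 + z^2 + xy + xz + yz$. It takes a triple to "serve as a G. C. F." exactly when its coordinate determinant is $\pm 1$; that criterion and all function names are assumptions of the sketch.

```python
from itertools import combinations, product

def twice_squared_length(c):
    """Twice the squared length of x*A + y*B + z*C (kept integral)."""
    x, y, z = c
    return 2 * (x * x + y * y + z * z) + 2 * (x * y + x * z + y * z)

def det3(a, b, c):
    """Determinant of the 3x3 integer matrix with rows a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

# All lattice vectors of length 1; coefficients beyond +-2 cannot occur.
units = [c for c in product(range(-2, 3), repeat=3)
         if twice_squared_length(c) == 2]

# One representative of each pair +-v (first non-zero coordinate positive).
half = [c for c in units if c > (0, 0, 0)]

# Triples of representatives that generate the whole lattice.
bases = [t for t in combinations(half, 3) if abs(det3(*t)) == 1]

total = 6 * 8 * len(bases)   # orderings x sign choices x unordered triples
```

The search finds six vectors up to sign and sixteen generating triples, giving the total of 768 stated above.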

The purpose of these last three theorems is to enable one to find all minimal
G. C. F.'s. For this, we proceed as follows. First, find a minimal G. C. F. by
use of Operation II, III, or IV. Then the vectors of any other minimal G. C. F.
must lie among the combinations listed in Theorems 18, 19, and 20. By trying
all possible combinations, we can find all minimal G. C. F.'s. The information
in Theorems 15, 16, and 17 is useful in eliminating unusable combinations.
In this connection it is worth remarking that if $U_1$, $V_1$, $W_1$, $X_1$ is a minimal
G. C. F. of $R_1, \dots, R_m$ and $U_2$, $V_2$, $W_2$, $X_2$ is a G. C. F. of $R_1, \dots, R_m$ and
$L(U_1) = L(U_2)$, $L(V_1) = L(V_2)$, $L(W_1) = L(W_2)$, $L(X_1) = L(X_2)$, then $U_2$,
$V_2$, $W_2$, $X_2$ is a minimal G. C. F. of $R_1, \dots, R_m$. For if $U_2$, $V_2$, $W_2$, $X_2$ is
not minimal, then by Theorem 11, Operation IV can be performed, giving $U_3$,
$V_3$, $W_3$, $X_3$ with $L(U_3) + L(V_3) + L(W_3) + L(X_3) < L(U_2) + L(V_2) +
L(W_2) + L(X_2) = L(U_1) + L(V_1) + L(W_1) + L(X_1)$. This contradicts
Theorem 14. Note that the result just stated does not hold without the condition
that $U_2$, $V_2$, $W_2$, $X_2$ be a G. C. F. of $R_1, \dots, R_m$. In fact, it is possible
for $U_2$, $V_2$, $W_2$, $X_2$ to be such that $U_2$ is a shortest non-zero I. L. C. of $R_1, \dots,
R_m$, $V_2$ is a shortest I. L. C. of $R_1, \dots, R_m$ which is independent of $U_2$, etc.,
without $U_2$, $V_2$, $W_2$, $X_2$ being a G. C. F. of $R_1, \dots, R_m$. Under these circumstances
$U_2$, $V_2$, $W_2$, $X_2$ naturally cannot be a minimal G. C. F. An illustration
of this situation would be where

\[
\begin{aligned}
R_1 &= (2, 0, 0, 0), \\
R_2 &= (0, 2, 0, 0), \\
R_3 &= (0, 0, 2, 0), \\
R_4 &= (0, 0, 0, 2), \\
R_5 &= (1, 1, 1, 1),
\end{aligned}
\]

and $U_2 = R_1$, $V_2 = R_2$, $W_2 = R_3$, $X_2 = R_4$. The $R$'s have a G. C. F., for
instance, $R_1$, $R_2$, $R_3$, $R_5$, but $U_2$, $V_2$, $W_2$, $X_2$ is not a G. C. F., although
all four of them are shortest non-zero I. L. C.'s of the $R$'s.

We now consider some further matters related to Problem I. Let $A$, $V_1$,
$V_2$, $V_3$, $V_4$ be as in Problem I, and let us find all minimal G. C. F.'s of $V_1$, $V_2$,
$V_3$, $V_4$, basing our definition of length on $A$. We continue applying Operation
IV until a minimal G. C. F. of $V_1$, $V_2$, $V_3$, $V_4$ is attained. Then we list all the
combinations mentioned in Theorem 20. From these we eliminate a large
number of unsuitable ones by use of Theorem 17. We get finally the following
six vectors, the first three of length $\sqrt{2}$, and the last three of length 2.

\[
\begin{aligned}
R_1 &= (1, 2, -1, 1), \\
R_2 &= (-2, -6, 3, -3), \\
R_3 &= (-1, -4, 2, -2), \\
R_4 &= (0, 0, 1, -2), \\
R_5 &= (0, -5, 2, -2), \\
R_6 &= (0, -5, 3, -4).
\end{aligned}
\]

If $U$, $V$, $W$, $X$ is a minimal G. C. F. of $V_1$, $V_2$, $V_3$, $V_4$, then $\pm U$ and $\pm V$ can
be any two of $R_1$, $R_2$, $R_3$ and $\pm W$, $\pm X$ can be any two of $R_4$, $R_5$, $R_6$. So
there are a total of 576 possible minimal G. C. F.'s of $V_1$, $V_2$, $V_3$, $V_4$.

Now consider the problem of finding all minima of the quadratic form $Q$
given in Problem I. $\pm R_1$, $\pm R_2$, and $\pm R_3$ each gives a minimum, but do they
give all the minima? The fact that $\pm R_1$, $\pm R_2$, $\pm R_3$ are all the possibilities
for $U$ does not enable us to conclude that they are all shortest non-zero I. L. C.'s
of $V_1$, $V_2$, $V_3$, $V_4$, because we have not proved the impossibility of having a
shortest non-zero I. L. C. of $V_1$, $V_2$, $V_3$, $V_4$ which is not a constituent of some
minimal G. C. F. of $V_1$, $V_2$, $V_3$, $V_4$. However, if one observes the proof of
Theorem 20 carefully, one will see that it is there proved that the combinations
listed will include all shortest non-zero I. L. C.'s, although this is not stated in
the theorem itself. So $\pm R_1$, $\pm R_2$, $\pm R_3$ are all of the minima of $Q$.

As we remarked while solving Problem I, $A$ is the matrix of the quadratic
form $2Q$. If we make a transformation of coordinates with matrix $P$, the new
quadratic form will have matrix $P^TAP$.⁶ If we wish the transformation to leave
the number-theoretic properties of $Q$ unchanged, $P$ must have integral components
and determinant $\pm 1$. So we say that two symmetric positive definite
matrices, $A$ and $B$, are equivalent (in the number-theoretic sense) if there is a
matrix $P$ with integral components and determinant $\pm 1$ such that $B = P^TAP$.

PROBLEM VI. With $A$ as in Problem I and

\[
B = \begin{pmatrix}
34 & -3 & 36 & 92 \\
-3 & 2 & -4 & -10 \\
36 & -4 & 44 & 110 \\
92 & -10 & 110 & 276
\end{pmatrix},
\]

determine if $A$ and $B$ are equivalent, and if they are, find $P$.

First, a few general considerations. Suppose $A$ and $B$ are equivalent, say
$B = P^TAP$. Let $V$ be any vector. Then the length of $V$ relative to $B$, $L_B(V)$,
is equal to $(V^TBV)^{1/2} = ((PV)^TA(PV))^{1/2} = L_A(PV)$. So if $U$ is a shortest
non-zero I. L. C. of $R_1, \dots, R_m$ relative to $B$, then $PU$ is a shortest non-zero
I. L. C. of $PR_1, \dots, PR_m$ relative to $A$. Let

\[
\begin{aligned}
S_1 &= (1, 0, 0, 0), \\
S_2 &= (0, 1, 0, 0), \\
S_3 &= (0, 0, 1, 0), \\
S_4 &= (0, 0, 0, 1),
\end{aligned}
\]

and let $U$, $V$, $W$, $X$ be a minimal G. C. F. of $S_1$, $S_2$, $S_3$, $S_4$ relative to $B$, for
instance

\[
\begin{aligned}
U &= (0, 1, 0, 0), \\
V &= (1, 0, 4, -2), \\
W &= (0, 0, 5, -2), \\
X &= (-2, 0, -6, 3).
\end{aligned}
\]

Then $PU$, $PV$, $PW$, $PX$ is a minimal G. C. F. of $PS_1$, $PS_2$, $PS_3$, $PS_4$ relative
to $A$. However, since the components of $P$ are integers, the components of
$PS_1$, $PS_2$, $PS_3$, $PS_4$ are also integers, and so each of $PS_1$, $PS_2$, $PS_3$, $PS_4$
is an I. L. C. of $S_1$, $S_2$, $S_3$, $S_4$. Conversely, any vector $T$ with integer components
is an I. L. C. of $PS_1$, $PS_2$, $PS_3$, $PS_4$, because if we try to solve
$aPS_1 + bPS_2 + cPS_3 + dPS_4 = T$ for $a$, $b$, $c$, and $d$, we find that we have
four linear equations to be solved whose matrix is $P$, and so whose determinant
is $\pm 1$. Hence the equations can be solved in integers. So $PS_1$, $PS_2$, $PS_3$,
$PS_4$ is a G. C. F. of $S_1$, $S_2$, $S_3$, $S_4$. So any minimal G. C. F. of $PS_1$, $PS_2$,
$PS_3$, $PS_4$, for instance $PU$, $PV$, $PW$, $PX$, is a minimal G. C. F. of $S_1$, $S_2$,
$S_3$, $S_4$. So one of the minimal G. C. F.'s of $S_1$, $S_2$, $S_3$, $S_4$ relative to $A$ (all
of which we found a while back) must be $PU$, $PV$, $PW$, $PX$. If we can guess
the right minimal G. C. F. relative to $A$, we can solve for $P$. Just to get going,
let us conjecture that perhaps $R_1$, $R_2$, $R_4$, $R_5$ is $PU$, $PV$, $PW$, $PX$. Before
solving for $P$, it is well to test the reasonableness of this conjecture. We find
that $L_A(R_1) = L_B(U)$, $L_A(R_2) = L_B(V)$, $L_A(R_4) = L_B(W)$, $L_A(R_5) = L_B(X)$.
This is as it should be. However $U^TBV = (PU)^TA(PV)$. So we should have
$U^TBV = R_1^TAR_2$. As a matter of fact, we have instead $U^TBV = -R_1^TAR_2$.
This suggests replacing $R_2$ by $-R_2$. For a similar reason we replace $R_5$ by $-R_5$.
Now we can find no reason why we should not have $R_1 = PU$, $-R_2 = PV$,
$R_4 = PW$, and $-R_5 = PX$, so we solve these for $P$, getting

\[
P = \begin{pmatrix}
-6 & 1 & -8 & -20 \\
-28 & 2 & -34 & -85 \\
13 & -1 & 17 & 42 \\
-13 & 1 & -18 & -44
\end{pmatrix}.
\]

With this value of $P$, we have $|P| = 1$ and $B = P^TAP$. So $A$ and $B$ are
equivalent.

6 M. Bôcher, Introduction to Higher Algebra, p. 129, Theorem 1.
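The relations used to determine $P$ can be checked by machine. The matrix $A$ of Problem I is not restated in this passage, so the Python sketch below recovers the $A$ implied by the printed $B$ and $P$, namely $A = (P^{-1})^T B P^{-1}$; that recovered matrix is an assumption of the sketch, not a quotation of Problem I. By construction it satisfies $B = P^TAP$, so what is actually tested is that $P^{-1}$ is integral and that $P$ carries $U$, $V$, $W$, $X$ to $R_1$, $-R_2$, $R_4$, $-R_5$.

```python
from fractions import Fraction

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def inverse(M):
    """Exact Gauss-Jordan inverse over the rationals."""
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for c in range(n):
        p = next(r for r in range(c, n) if aug[r][c] != 0)   # pivot row
        aug[c], aug[p] = aug[p], aug[c]
        aug[c] = [x / aug[c][c] for x in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def apply(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]

B = [[34, -3, 36, 92], [-3, 2, -4, -10], [36, -4, 44, 110], [92, -10, 110, 276]]
P = [[-6, 1, -8, -20], [-28, 2, -34, -85], [13, -1, 17, 42], [-13, 1, -18, -44]]

P_inv = inverse(P)
A = matmul(transpose(P_inv), matmul(B, P_inv))   # the A implied by B = P^T A P

U, V, W, X = [0, 1, 0, 0], [1, 0, 4, -2], [0, 0, 5, -2], [-2, 0, -6, 3]
images = [apply(P, t) for t in (U, V, W, X)]     # expected R1, -R2, R4, -R5
```

The recovered $A$ has integer entries, as it must when $P$ has determinant $\pm 1$.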

The reader may wonder whether it was just luck that we found $P$ so readily.
Clearly, with 576 possible minimal G. C. F.'s relative to $A$ to choose from, if
only one would have given $P$, then we certainly were lucky. However, as a
matter of fact, there are quite a large number of $P$'s such that $B = P^TAP$,
and had we tried some other minimal G. C. F. relative to $A$, we would have
merely found some other $P$. The fact is that there are quite a number of $P$'s
with integral components and determinant $\pm 1$ such that $P^TAP = A$. Such
a $P$ we call an automorphism (in the number-theoretic sense) of $A$. By using
two different properly chosen minimal G. C. F.'s relative to $A$, we can find
automorphisms of $A$. For instance, if we choose $P$ so that $PR_1 = R_3$, $PR_2 =
-R_2$, $PR_4 = -R_5$, and $PR_5 = -R_4$, we get

\[
P = \begin{pmatrix}
-1 & 0 & 0 & 0 \\
-6 & -4 & -25 & -15 \\
3 & 3 & 16 & 9 \\
-3 & -4 & -20 & -11
\end{pmatrix}.
\]

Then $|P| = 1$ and $A = P^TAP$, so that $P$ is an automorphism of $A$.
As any automorphism of $A$ carries a minimal G. C. F. into a different minimal
G. C. F., and as there are only a finite number of distinct minimal G. C. F.'s,
there can be only a finite number of automorphisms of $A$. As the automorphisms
of $A$ form a group, it follows that any automorphism of $A$ must be of
finite order. For the $P$ given above, $P^2 = 1$. In order to count the number of
automorphisms of $A$, note that if $Y_1$, $Y_2$, $Y_3$, $Y_4$ and $Z_1$, $Z_2$, $Z_3$, $Z_4$ are two
minimal G. C. F.'s such that $Y_i^TAY_j = Z_i^TAZ_j$ for all $i$ and $j$, then there must
be an automorphism $P$ such that $Y_i = PZ_i$. To show this, first choose $G$
and $H$ so that $GS_i = Y_i$ and $HS_i = Z_i$. Then since $Y_1$, $Y_2$, $Y_3$, $Y_4$ is a
G. C. F. of $S_1$, $S_2$, $S_3$, $S_4$, it follows that $G$ has integral components and
determinant $\pm 1$. Similarly for $H$. Now $Y_i^TAY_j = S_i^T(G^TAG)S_j$ = the element
in the $i$-th row and $j$-th column of $G^TAG$. Similarly $Z_i^TAZ_j =
S_i^T(H^TAH)S_j$ = the element in the $i$-th row and $j$-th column of $H^TAH$. So,
since $Y_i^TAY_j = Z_i^TAZ_j$, the matrices $G^TAG$ and $H^TAH$ are identical. From
$G^TAG = H^TAH$, we clearly get $(GH^{-1})^TA(GH^{-1}) = A$. So $GH^{-1}$ is an automorphism
of $A$. However $(GH^{-1})Z_i = (GH^{-1})(HS_i) = GS_i = Y_i$. To facilitate
choosing such sets of $Y$'s and $Z$'s, we list the values of $R_i^TAR_j$ in Table IV
at the end of the paper. We might as well choose the same $Y$'s every time, say
$Y_1 = R_1$, $Y_2 = R_2$, $Y_3 = R_4$, $Y_4 = R_5$. Then $Z_1$ can be any one of $\pm R_1$,
$\pm R_2$, $\pm R_3$. Whichever of these we choose for $Z_1$, if any of the remaining
five be chosen for $Z_2$, it will give $Z_2^TAZ_2$ the right value, but only two can be
chosen which will give $Z_1^TAZ_2$ the right value. So there are twelve possible
choices for $Z_1$, $Z_2$. Similarly $Z_3$ can be any one of $\pm R_4$, $\pm R_5$, $\pm R_6$, with
two choices left for $Z_4$. So there are 144 different automorphisms of $A$. This
count takes $P$ and $-P$ as distinct automorphisms, so that there are 72 essentially
distinct automorphisms of $A$.
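The count of 144 can be confirmed by enumeration. The Python sketch below rests on two assumptions that are not quoted from this passage: the matrix `A` is the one implied by the printed $B$ and $P$ of Problem VI, and $R_3$, $R_6$ are taken as $R_1 + R_2$ and $R_4 + R_5$. It counts the ordered quadruples $Z_1, \dots, Z_4$ of signed $R$'s whose table of values $Z_i^TAZ_j$ agrees with that of $Y_1, \dots, Y_4$, and also checks the sample automorphism.

```python
from itertools import product

# Assumed matrix of 2Q, recovered from B = P^T A P (not restated in this passage).
A = [[14, 18, 90, 45], [18, 28, 134, 66], [90, 134, 652, 324], [45, 66, 324, 162]]

def form(u, v):
    """Bilinear form u^T A v."""
    return sum(u[i] * A[i][j] * v[j] for i in range(4) for j in range(4))

R1, R2, R4, R5 = (1, 2, -1, 1), (-2, -6, 3, -3), (0, 0, 1, -2), (0, -5, 2, -2)
R3 = (-1, -4, 2, -2)   # taken as R1 + R2
R6 = (0, -5, 3, -4)    # equals R4 + R5

signed = [tuple(s * t for t in r) for r in (R1, R2, R3, R4, R5, R6) for s in (1, -1)]

Y = (R1, R2, R4, R5)
gram_Y = [[form(a, b) for b in Y] for a in Y]

# Each admissible Z determines exactly one automorphism P with Y_i = P Z_i.
count = sum(1 for Z in product(signed, repeat=4)
            if [[form(a, b) for b in Z] for a in Z] == gram_Y)

# The sample automorphism printed in the text.
P = [[-1, 0, 0, 0], [-6, -4, -25, -15], [3, 3, 16, 9], [-3, -4, -20, -11]]
PtAP = [[sum(P[k][i] * A[k][l] * P[l][j] for k in range(4) for l in range(4))
         for j in range(4)] for i in range(4)]
```

Agreement of the two tables forces $|\det Z| = |\det Y| = 1$, so every quadruple counted is itself a G. C. F. of $S_1, \dots, S_4$.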

Prof. R. J. Walker has made an interesting suggestion. In several of the
problems which we considered, we were primarily interested in obtaining I. L. C.'s
with small components, and we were able to solve the problems by use of our
algorithm only because the short vectors which are obtained by use of it necessarily
have small components. This suggests that we define the length of a
vector to be the absolute value of its numerically largest component. This
would give quite a satisfactory metric. Furthermore, with this metric the
problem of finding a shortest vector of the form $U - nV$ could be easily solved.
Hence we could define Operations II, III, and IV as above, only with this new
metric in mind. The particular cause which is responsible for the failure of the
algorithms of this paper in the case of five or more dimensions would fail to
apply to the new metric. Hence, with the new metric there is at least a possibility
that there always exists a minimal G. C. F. in any number of dimensions,
and that a generalization of Operations II, III, and IV (perhaps with slight
modifications) will suffice to find a minimal G. C. F. However, as yet we cannot
prove anything useful about the new metric in more than two dimensions.
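A shortest vector of the form $U - nV$ under the suggested metric can be found by a bounded search. The Python sketch below is one straightforward way to do it, not a method given in the paper: any $n$ with $|n| > 2\max|u_i| / \max|v_i|$ gives a vector longer than $U$ itself, so only finitely many $n$ need to be tried. The function names and the sample vectors are illustrative.

```python
def max_norm(v):
    """Length in the suggested metric: largest absolute component."""
    return max(abs(t) for t in v)

def best_multiple(u, v):
    """Integer n minimising max_i |u_i - n*v_i| (v must be non-zero)."""
    bound = 2 * max_norm(u) // max_norm(v) + 1   # no minimiser lies outside
    return min(
        range(-bound, bound + 1),
        key=lambda n: (max_norm([a - n * b for a, b in zip(u, v)]), abs(n)),
    )

n1 = best_multiple((5, 5, 6, 4), (5, -2, 6, -1))
n2 = best_multiple((9, 5, 3, 15), (3, 13, 4, 9))
shortest2 = [a - n2 * b for a, b in zip((9, 5, 3, 15), (3, 13, 4, 9))]
```

Note that the first pair is left unchanged under this metric ($n = 0$), although under the Euclidean metric the same pair is reduced with $n = 1$.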


CORNELL UNIVERSITY.


TABLE I

























































































































TABLE II 
- x a i wi 7 2 w I, Lz L; | I 
"2557879 | 2869678 | 3909845 | —9976556 | 0| 1 2 | 7 
"4248882 | 4766810 | 6494627 | —16572015 | —1 3|-2| o 
3586719 | 4023931 | 5482478 | —13980365| —3 | —1 1| -2 
6144598 | 6893609 | 9392323  —23965921 | —3 0 g| <j 
— 1028840 | 1154253 | 1572633 | —4012809 | -3  -2, -1| -3 
TABLE III 
oe = 
ens ae a ie wai a I<es er. 
2 ‘ b> “4 2<:< 12 
7 | 2 "sae 7<:< 600 
466 | ~~ 1464 269 | 466<2< 3100 
1010 3173 | 583 1010 <z< 6600 
2493 7832 1439 2493 <z< 11,000 
8023 | 25205 4631 8023 <z< 210,000 
62701 | 196981 36192 62701 < z < 1,000,000 
TABLE IV

         R_1    R_2    R_3    R_4    R_5    R_6
R_1       2     -1      1      0      0      0
R_2      -1      2      1      0      0      0
R_3       1      1      2      0      0      0
R_4       0      0      0      4     -2      2
R_5       0      0      0     -2      4      2
R_6       0      0      0      2      2      4











POSITIVE DEFINITE FUNCTIONS ON SPHERES 


By I. J. SCHOENBERG 


1. Introduction. Let $S_2$ denote the ordinary spherical shell of radius one
and center $o$ and let $p_1, p_2, \dots, p_n$ be $n$ arbitrary points of $S_2$. Let $p_ip_k$
denote the spherical distance between the points $p_i$, $p_k$. For $n$ real variables
$x_1, x_2, \dots, x_n$ we have the following inequality

\[
\left| \sum_{i=1}^{n} \overrightarrow{op_i}\, x_i \right|^2 = \sum_{i,k=1}^{n} \cos (p_ip_k)\, x_ix_k \ge 0, \tag{1.1}
\]

which is equivalent to the determinant inequality

\[
\det \| \cos (p_ip_k) \|_{i,k=1}^{n} \ge 0
\]

for arbitrary points $p_i$ and arbitrary $n$. This property of the function $g(t) = \cos t$
in relation to the space $S_2$ is expressed by saying that $\cos t$ is positive definite
in $S_2$. The general definition is as follows. Let $M$ be a metric space with the
distance function $pq$. A real continuous function $g(t)$ ($0 \le t \le$ diameter of $M$)
is said to be positive definite (p. d.) in $M$ if we have

\[
\sum_{i,k=1}^{n} g(p_ip_k)\, x_ix_k \ge 0, \tag{1.2}
\]

for any $n$ points $p_1, \dots, p_n$ of $M$, arbitrary real $x_i$, and all $n = 2, 3, \dots$.
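Inequality (1.1) lends itself to a quick numerical experiment. The Python sketch below draws random points on the unit sphere in ordinary space and random real weights, and evaluates the quadratic form in (1.2) for $g(t) = \cos t$; the sampling scheme, the seed, and the function names are illustrative assumptions.

```python
import math
import random

def random_unit_vector(rng, dim=3):
    """A random point of the unit sphere in dim-dimensional Euclidean space."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(t * t for t in v))
    return [t / norm for t in v]

def quadratic_form(points, x, g=math.cos):
    """Sum over i, k of g(spherical distance between p_i and p_k) * x_i * x_k."""
    total = 0.0
    for p_i, x_i in zip(points, x):
        for p_k, x_k in zip(points, x):
            c = max(-1.0, min(1.0, sum(a * b for a, b in zip(p_i, p_k))))
            total += g(math.acos(c)) * x_i * x_k   # acos(c) is the spherical distance
    return total

rng = random.Random(0)
worst = min(
    quadratic_form([random_unit_vector(rng) for _ in range(6)],
                   [rng.uniform(-1.0, 1.0) for _ in range(6)])
    for _ in range(200)
)
```

Up to rounding, no trial yields a negative value, as (1.1) predicts; passing a different function `g` gives a crude test of whether it could be positive definite.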


We denote this class of functions by the symbol $\mathfrak{P}(M)$. It enjoys the following
useful closure properties:

I. If $g_1(t) \in \mathfrak{P}(M)$, $g_2(t) \in \mathfrak{P}(M)$, also $c_1g_1(t) + c_2g_2(t) \in \mathfrak{P}(M)$, provided $c_1 \ge 0$,
$c_2 \ge 0$.

II. The same assumptions imply also that $g_1(t)g_2(t) \in \mathfrak{P}(M)$.

III. If $g_n(t) \in \mathfrak{P}(M)$, $g_n(t) \to g(t)$ as $n \to \infty$, and $g(t)$ is continuous, then also
$g(t) \in \mathfrak{P}(M)$.

In the present note we are concerned with the classes $\mathfrak{P}(S_m)$ and $\mathfrak{P}(S_\infty)$
corresponding to the unit spheres in the Euclidean space $E_{m+1}$ and the Hilbert
space $H$ respectively.

Returning to $S_2$, we have noticed that $\cos t \in \mathfrak{P}(S_2)$. It will be shown below
that also $P_n(\cos t) \in \mathfrak{P}(S_2)$, where $P_n$ is a Legendre polynomial. By the above
mentioned closure properties it is now apparent that also

\[
g(t) = \sum_{n=0}^{\infty} a_n P_n(\cos t) \in \mathfrak{P}(S_2), \tag{1.3}
\]

provided $a_n \ge 0$ ($n = 0, 1, \dots$) and $\sum a_n$ converges. This formula will be
shown to furnish the most general element of $\mathfrak{P}(S_2)$.


Received August 25, 1941; presented to the American Mathematical Society, September 
11, 1940. 
1 See [6], §2. Numbers in brackets refer to the bibliography at the end of the paper. 




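Let $P_n^{(\lambda)}(\cos t)$ be the ultraspherical polynomials defined by the expansion

\[
(1 - 2r \cos t + r^2)^{-\lambda} = \sum_{n=0}^{\infty} r^n P_n^{(\lambda)}(\cos t), \qquad (\lambda > 0). \tag{1.4}
\]

For $\lambda = 0$ we set

\[
P_n^{(0)}(\cos t) = \cos nt = T_n(\cos t).
\]

Our result (1.3) extends to $S_m$ ($m = 1, 2, \dots$) as follows. The most general
element of $\mathfrak{P}(S_m)$ is

\[
g(t) = \sum_{n=0}^{\infty} a_n P_n^{(\lambda)}(\cos t), \qquad (\lambda = \tfrac{1}{2}(m - 1)), \tag{1.5}
\]

provided all $a_n \ge 0$ and that we have convergence for $t = 0$. This will follow
readily from classical properties of ultraspherical polynomials (§3).

An obvious property of the class $\mathfrak{P}(M)$ is as follows. If $M \subset N$, then
$\mathfrak{P}(M) \supset \mathfrak{P}(N)$. As we may assume

\[
S_1 \subset S_2 \subset S_3 \subset \cdots \subset S_\infty ,
\]

it follows that

\[
\mathfrak{P}(S_1) \supset \mathfrak{P}(S_2) \supset \cdots \supset \mathfrak{P}(S_m) \supset \cdots \supset \mathfrak{P}(S_\infty).
\]

In fact, $\mathfrak{P}(S_\infty)$ is identical with the intersection of all classes $\mathfrak{P}(S_m)$ ($m = 1,
2, \dots$). Since $\cos t \in \mathfrak{P}(S_m)$ for all $m$, we have $\cos t \in \mathfrak{P}(S_\infty)$ and, by the closure
property II, also $(\cos t)^n \in \mathfrak{P}(S_\infty)$. The closure properties I and III show that

\[
g(t) = \sum_{n=0}^{\infty} a_n (\cos t)^n \in \mathfrak{P}(S_\infty), \tag{1.6}
\]

provided $a_n \ge 0$ and $\sum a_n$ converges. It will be shown that the functions
$g(t)$ of the form (1.6) exhaust the class $\mathfrak{P}(S_\infty)$ (§4).²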


2 This result establishes the converse of Problem 37 of Pólya and Szegö, [5], p. 107. This
converse may be stated as follows. Let $F(x)$, ($-1 \le x \le 1$), be a continuous function
enjoying the following property. If the quadratic form $\sum a_{ik} x_i x_k$, ($-1 \le a_{ik} \le 1$; $i, k = 1, \dots,
n$; $n = 2, 3, \dots$), is positive, the form $\sum F(a_{ik}) x_i x_k$ should also be positive. Then $F(x)$ is
necessarily of the form

\[
F(x) = \sum_{n=0}^{\infty} a_n x^n, \qquad (a_n \ge 0,\ -1 \le x \le 1).
\]

This result is implied by the fact that every element of $\mathfrak{P}(S_\infty)$ is of the form (1.6). Indeed,
it suffices to assume $a_{11} = a_{22} = \cdots = a_{nn} = 1$ and to notice (i) that then $a_{ik} = \cos
(p_ip_k)$, where $p_1, \dots, p_n$ are $n$ appropriate points of $S_{n-1}$, (ii) that the assumptions of our
proposition imply that $F(\cos t)$ is positive definite in $S_\infty$.

A second application of our results is as follows. The classical expansion of $(\cos t)^n$
in terms of Legendre polynomials has positive coefficients. The reason for this is now
obvious in view of the relations $(\cos t)^n \in \mathfrak{P}(S_\infty) \subset \mathfrak{P}(S_2)$ and formula (1.3). We may predict
in the same way that the expansion of $P_n^{(\lambda)}(\cos t)$ in terms of the polynomials $P_m^{(\mu)}(\cos t)$,
($m = 0, 1, \dots$; $0 \le \mu < \lambda$), will likewise have positive coefficients, provided $2\mu$,
$2\lambda$ are integers.










The fifth and last section of this paper is devoted to problems of isometric 
imbedding in Hilbert space. We determine all metric transforms of the sphere 
T,, which can be imbedded isometrically in H. 


2. Some classical results concerning ultraspherical developments. We return
to the spherical shell $S_m$ of $E_{m+1}$, defined by the equation $x_0^2 + x_1^2 + \cdots
+ x_m^2 = 1$. If $p$ is an arbitrary point of $S_m$ and $0 \le r < 1$, let $p_r$ be that point
on the radius $op$ such that $op_r = r$. Furthermore, let $F(p)$ be a real and continuous
point function defined in $S_m$ and let³

\[
F(p_r) = \frac{1}{\omega_m} \int_{S_m} \frac{1 - r^2}{(1 - 2r \cos pp' + r^2)^{(m+1)/2}}\, F(p')\, d\omega_{p'}, \tag{2.1}
\]

where $\omega_m = 2\pi^{(m+1)/2} / \Gamma[\tfrac{1}{2}(m + 1)]$ is the "area" of $S_m$ and $pp'$ denotes a spherical
distance. For $m = 2$, (2.1) reduces to the ordinary Poisson integral which
solves the boundary value problem for harmonic functions inside the sphere.
A classical result concerning the Poisson integral (2.1) is the limiting relation

\[
\lim_{r \to 1} F(p_r) = F(p). \tag{2.2}
\]

Differentiating (1.4) with respect to $r$ and setting $t = pp'$, we get

\[
\frac{1 - r^2}{(1 - 2r \cos pp' + r^2)^{\lambda + 1}} = \sum_{n=0}^{\infty} \left( \frac{n}{\lambda} + 1 \right) r^n P_n^{(\lambda)}(\cos pp'), \qquad (0 \le r < 1). \tag{2.3}
\]
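Relations (1.4) and (2.3) can be checked numerically. The Python sketch below generates the ultraspherical (Gegenbauer) polynomials by their standard three-term recurrence, $nP_n^{(\lambda)}(x) = 2(n + \lambda - 1)\,x\,P_{n-1}^{(\lambda)}(x) - (n + 2\lambda - 2)\,P_{n-2}^{(\lambda)}(x)$, and compares both sides of each identity at one sample point; the sample values of $\lambda$, $r$, $x$, the truncation at 80 terms, and the function name are illustrative choices.

```python
def ultraspherical(n_max, lam, x):
    """Values P_0^(lam)(x), ..., P_n_max^(lam)(x) by the three-term recurrence."""
    vals = [1.0, 2.0 * lam * x]
    for n in range(2, n_max + 1):
        vals.append((2.0 * (n + lam - 1.0) * x * vals[-1]
                     - (n + 2.0 * lam - 2.0) * vals[-2]) / n)
    return vals[: n_max + 1]

lam, r, x = 1.5, 0.4, 0.3          # lam = (m - 1)/2 with m = 4; x stands for cos t
N = 80                              # truncation order of the series
P = ultraspherical(N, lam, x)
base = 1.0 - 2.0 * r * x + r * r

lhs_14 = base ** (-lam)                                   # left side of (1.4)
rhs_14 = sum(r ** n * P[n] for n in range(N + 1))         # right side of (1.4)

lhs_23 = (1.0 - r * r) / base ** (lam + 1.0)              # left side of (2.3)
rhs_23 = sum((n / lam + 1.0) * r ** n * P[n] for n in range(N + 1))
```

The identity (2.3) follows from (1.4) because the series with coefficients $(n/\lambda + 1)\,r^n$ is obtained by adding to the generating function $r/\lambda$ times its derivative with respect to $r$.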


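Let $\lambda = \tfrac{1}{2}(m - 1)$. Multiplying (2.3) by $F(p')\, d\omega_{p'}/\omega_m$ and integrating both
sides over $S_m$ we now get, in view of (2.1), the development⁴

\[
F(p_r) = \sum_{n=0}^{\infty} r^n\, \frac{n + \lambda}{\lambda\, \omega_m} \int_{S_m} F(p') P_n^{(\lambda)}(\cos pp')\, d\omega_{p'}, \qquad (\lambda = \tfrac{1}{2}(m - 1)). \tag{2.4}
\]

The ultraspherical development of $F(p)$ is now given by

\[
F(p) \sim \sum_{n=0}^{\infty} \frac{n + \lambda}{\lambda\, \omega_m} \int_{S_m} F(p') P_n^{(\lambda)}(\cos pp')\, d\omega_{p'}. \tag{2.5}
\]

Our relation (2.2) expresses the well-known fact that the ultraspherical expansion
(2.5) is Abel-summable at every point $p$ of $S_m$ to the sum $F(p)$.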
We shall now derive the special form of the expansion (2.5) for the case of a 


3 See [1], p. 198. 
4 See [1], formula (18), p. 207. 










zonal function F(p). For this purpose we express the cooordinates of the point 
p = (x, %1,°** , 2m) in terms of polar coordinates as follows: 


x = cos 8, 
2, = sin 6 cos 6, 


X2 = sin 6 sin 4; cos &, 
(2.6) 


Im-2 = sin 6 sin 6; «++ COS Om_2, 
Im—1 = sin O sin & --- sin On_2 Cos ¢, 05 657,05¢ S&S 2n), 
Im = sin Osin & «++ Sin On sin ¢, (vy = 0,1, +--+,m — 2), 
and consider the special case 
F(p) = f(cos 6). 
The integral on the right hand side of (2.5) now becomes 
l= / A, | [F se0s 6’)P(cos pp’) 


-sin” @’ sin” 6; +++ sin On. d0’ +++ dOn_s dq’. 


(2.7) 


Integrating first with respect to the variables 61, °**, Ono, @, we get 


(2.8) I, = [ f(cos 6’)J, sin” 6’ dé’, 
0 
where we set 
(2.9) J,= / vee / [ P®(cos pp’) sin” 6; «++ sin On~2 d0; +++ dOm-» dd’. 
0 0 
This integral is now readily reducible to a simple integral as follows. Consider 
the two points p; and p; of polar coordinates 
= (3x, Ra**s Om—2 » >), Pi mie (3m, 6; , —— On 2» $"), 
both lying in the unit sphere S,,_, defined by 6 = 3x. Obviously 
(2.10) cos pp’ = cos 6 cos 6’ + sin @ sin 6 cos pip; . 


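Now we notice that $J_n$ as given by (2.9) amounts to an integration of $P_n^{(\lambda)}(\cos pp')$ over $S_{m-1}$. If we take in $S_{m-1}$ a new system of polar coordinates $(\zeta = p_1p_1', \zeta_1, \dots, \zeta_{m-3}, \psi)$, of pole $p_1$, we obtain

$J_n = \int_0^\pi \cdots \int_0^\pi \int_0^{2\pi} P_n^{(\lambda)}(\cos pp')\,\sin^{m-2}\zeta\,\sin^{m-3}\zeta_1 \cdots \sin\zeta_{m-3}\;d\zeta\,d\zeta_1 \cdots d\zeta_{m-3}\,d\psi.$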










100 I. J. SCHOENBERG 


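Since cos pp' depends only on $\zeta$, as shown by (2.10), the remaining integrations may be carried out leading to the expression

$J_n = \omega_{m-2} \int_0^\pi P_n^{(\lambda)}(\cos pp')\,\sin^{m-2}\zeta\,d\zeta.$

Now, since⁵

$\int_0^\pi P_n^{(\lambda)}(\cos pp')\,\sin^{2\lambda-1}\zeta\,d\zeta = \frac{\Gamma(\lambda)\,\Gamma(\tfrac12)\,\Gamma(n+1)\,\Gamma(2\lambda)}{\Gamma(\lambda+\tfrac12)\,\Gamma(n+2\lambda)}\,P_n^{(\lambda)}(\cos\theta)\,P_n^{(\lambda)}(\cos\theta'),$

$J_n$ is explicitly computed. Substituting its value into (2.8) we find that the general expansion (2.5) reduces to⁶

(2.11)  $f(\cos\theta) \sim \sum_{n=0}^{\infty} \frac{(n+\lambda)\,\Gamma(\lambda)\,\Gamma(n+1)\,\Gamma(2\lambda)}{\Gamma(\tfrac12)\,\Gamma(\lambda+\tfrac12)\,\Gamma(n+2\lambda)}\,P_n^{(\lambda)}(\cos\theta) \int_0^\pi P_n^{(\lambda)}(\cos\theta')\,f(\cos\theta')\,\sin^{2\lambda}\theta'\,d\theta'.$

This expansion is, of course, also Abel-summable. It will be applied in the next section to functions f(cos θ) which are positive definite in $S_m$.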
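As an illustration of the expansion (2.11), the sketch below computes the first few coefficients for the sample function $f(x) = x^2$ with $\lambda = 1$ (that is, m = 3). It assumes that $P_n^{(\lambda)}$ is the ultraspherical (Gegenbauer) polynomial normalized by the generating function $(1 - 2rt + r^2)^{-\lambda}$, and that the constant in front of the integral is $(n+\lambda)\Gamma(\lambda)\Gamma(n+1)\Gamma(2\lambda) / (\Gamma(\tfrac12)\Gamma(\lambda+\tfrac12)\Gamma(n+2\lambda))$; the function names and the Simpson quadrature are choices made for this example only.

```python
import math

def gegenbauer(n, lam, x):
    """P_n^{(lam)}(x) by the three-term recurrence (assumes lam > 0)."""
    if n == 0:
        return 1.0
    prev, cur = 1.0, 2.0 * lam * x
    for k in range(2, n + 1):
        prev, cur = cur, (2.0 * (k + lam - 1.0) * x * cur
                          - (k + 2.0 * lam - 2.0) * prev) / k
    return cur

def simpson(g, a, b, steps=2000):
    """Composite Simpson rule with an even number of steps."""
    h = (b - a) / steps
    total = g(a) + g(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3.0

def coefficient(f, n, lam):
    """Coefficient of P_n^{(lam)}(cos theta) in the expansion of f(cos theta)."""
    const = ((n + lam) * math.gamma(lam) * math.gamma(n + 1) * math.gamma(2 * lam)
             / (math.gamma(0.5) * math.gamma(lam + 0.5) * math.gamma(n + 2 * lam)))
    integral = simpson(
        lambda t: gegenbauer(n, lam, math.cos(t)) * f(math.cos(t)) * math.sin(t) ** (2 * lam),
        0.0, math.pi)
    return const * integral

# Sample function f(x) = x^2 with lam = 1, where P_2(x) = 4x^2 - 1.
coeffs = [coefficient(lambda x: x * x, n, 1.0) for n in range(4)]
```

For this choice $x^2 = \tfrac14 P_0^{(1)}(x) + \tfrac14 P_2^{(1)}(x)$, so the computed coefficients should be close to 1/4, 0, 1/4, 0, all non-negative.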


3. On positive definite functions in $S_m$. Let f(x) be real and continuous in the interval $-1 \le x \le 1$ and such that f(cos t) is p. d. in $S_m$. As stated in the introduction this amounts to the inequality

(3.1)  $\sum_{i,k=1}^{N} f(\cos p_ip_k)\,x_ix_k \ge 0, \quad (p_i \in S_m,\ x_i \text{ real}).$


According to W. H. Young this quadratic form requirement is equivalent to the integral inequality⁷

(3.2)  $I(h) = \int_{S_m}\!\int_{S_m} f(\cos pp')\,h(p)\,h(p')\,d\omega_p\,d\omega_{p'} \ge 0$

for an arbitrary continuous point function h(p) in $S_m$.
A consequence of this re-statement is as follows. If h(p) = 1, (3.2) becomes

$I(1) = \int_{S_m}\!\int_{S_m} f(\cos pp')\,d\omega_p\,d\omega_{p'} = \omega_m \int_{S_m} f(\cos pp')\,d\omega_{p'},$

since the last integral is clearly independent of p. Since $I(1) \ge 0$, we have proved that

(3.3)  $\int_{S_m} f(\cos pp')\,d\omega_{p'} \ge 0,$

provided f(cos t) is p. d. in $S_m$.

5 See [4], formula (1), p. 203. 

6 See [4], p. 198. We use below the fact that the expansion (2.11) is a special case of (2.5). Our derivation of (2.11) from (2.5) is well known as an analogue of a classical reduction. However, being unable to find it in the literature, I developed it here in detail.

7 See [8].







A further result which we need is as follows. The ultraspherical polynomials
$P_n^{(\lambda)}(\cos t), \quad (n = 0, 1, 2, \dots;\ \lambda = \tfrac12(m - 1)),$
are all p. d. in $S_m$.

This statement is trivial for m = 1 ($\lambda = 0$) in view of the cosine addition formula. Let $m \ge 2$. In order to prove that $P_n^{(\lambda)}(\cos t)$ is p. d. in $S_m$ we proceed by induction assuming $P_s^{(\lambda-\frac12)}(\cos t)$ to be p. d. in $S_{m-1}$. Suppose $p_i \in S_m$ (i = 1, …, N) and associate with the point $p_i$ a point $p_i'$ on the "equator" $S_{m-1}$, of equation $\theta = \tfrac12\pi$, such that the last m − 1 polar coordinates $\theta_1, \dots, \varphi$ of both points $p_i$ and $p_i'$ agree. As remarked before we have

$\cos p_ip_k = \cos\theta^i\cos\theta^k + \sin\theta^i\sin\theta^k\cos p_i'p_k'.$

By the addition formula for ultraspherical polynomials⁸ we may write

$P_n^{(\lambda)}(\cos p_ip_k) = \sum_{s=0}^{n} c_{n,\lambda,s}\,P_n^{\lambda,s}(\cos\theta^i)\,P_n^{\lambda,s}(\cos\theta^k)\,P_s^{(\lambda-\frac12)}(\cos p_i'p_k'),$

where $P_n^{\lambda,s}$ are the real polynomials associated to $P_n^{(\lambda)}$ and $c_{n,\lambda,s}$ are positive coefficients whose values are here of no concern. But then

(3.4)  $\sum_{i,k=1}^{N} P_n^{(\lambda)}(\cos p_ip_k)\,\xi_i\xi_k = \sum_{s=0}^{n} c_{n,\lambda,s} \sum_{i,k=1}^{N} P_s^{(\lambda-\frac12)}(\cos p_i'p_k')\,\eta_i\eta_k \ge 0,$

where $\eta_i = P_n^{\lambda,s}(\cos\theta^i)\,\xi_i$. The expression (3.4) is indeed non-negative since $P_s^{(\lambda-\frac12)}(\cos t)$ was assumed to be p. d. in $S_{m-1}$. This completes our proof.
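The positive definiteness just proved can be probed numerically. The sketch below is only a spot check under stated assumptions: it takes $P_n^{(\lambda)}$ to be the Gegenbauer polynomial with generating function $(1 - 2rt + r^2)^{-\lambda}$, draws random points on $S_m$ for m = 2, 3, 4, and evaluates the quadratic form of (3.1) for random real weights. The names `gegenbauer`, `unit_vector` and `min_quadratic_form` are hypothetical.

```python
import math
import random

def gegenbauer(n, lam, x):
    """P_n^{(lam)}(x) by the three-term recurrence (assumes lam > 0)."""
    if n == 0:
        return 1.0
    prev, cur = 1.0, 2.0 * lam * x
    for k in range(2, n + 1):
        prev, cur = cur, (2.0 * (k + lam - 1.0) * x * cur
                          - (k + 2.0 * lam - 2.0) * prev) / k
    return cur

def unit_vector(dim, rng):
    """Random point on the unit sphere of a dim-dimensional Euclidean space."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def min_quadratic_form(n, m, num_points=12, trials=200, seed=0):
    """Smallest observed value of sum_{i,k} P_n(cos p_i p_k) x_i x_k on S_m."""
    rng = random.Random(seed)
    lam = 0.5 * (m - 1)
    pts = [unit_vector(m + 1, rng) for _ in range(num_points)]
    gram = [[gegenbauer(n, lam, max(-1.0, min(1.0, sum(a * b for a, b in zip(p, q)))))
             for q in pts] for p in pts]
    worst = float("inf")
    for _ in range(trials):
        x = [rng.gauss(0.0, 1.0) for _ in range(num_points)]
        form = sum(gram[i][k] * x[i] * x[k]
                   for i in range(num_points) for k in range(num_points))
        worst = min(worst, form)
    return worst

worst = min(min_quadratic_form(n, m) for n in range(5) for m in (2, 3, 4))
```

A value of `worst` that is non-negative up to rounding is consistent with the statement; random sampling of this kind is of course no substitute for the proof.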

We may now establish the following theorem.⁹

THEOREM 1. A necessary and sufficient condition in order that f(cos θ) be positive definite in $S_m$ is that the ultraspherical expansion (2.11) have non-negative coefficients, in which case the series (2.11) converges throughout $0 \le \theta \le \pi$ absolutely and uniformly to the sum f(cos θ). The most general f(cos θ) which is p. d. in $S_m$ is therefore given by the expansion

(3.5)  $f(\cos\theta) = \sum_{n=0}^{\infty} a_n P_n^{(\lambda)}(\cos\theta), \quad (a_n \ge 0,\ \lambda = \tfrac12(m - 1)),$

provided the series converges for θ = 0.


Indeed, let f(cos θ) be p. d. in $S_m$. The coefficient of $P_n^{(\lambda)}(\cos\theta)$ in (2.11) may be written as

$\int_0^\pi P_n^{(\lambda)}(\cos\theta')\,f(\cos\theta')\,\sin^{m-1}\theta'\,d\theta' = \frac{1}{\omega_{m-1}} \int_{S_m} P_n^{(\lambda)}(\cos ap')\,f(\cos ap')\,d\omega_{p'},$

where a is the point of $S_m$ of coordinates $x_0 = 1$, $x_1 = \cdots = x_m = 0$. But then the last integral is visibly positive for the following reason. Since $P_n^{(\lambda)}(\cos t)$ and f(cos t) are both p. d. in $S_m$, their product also enjoys this


8 See [4], formula (6), p. 182. 
9 Recently Bochner has extended this theorem by means of the theory of generalized spherical harmonics of E. Cartan and H. Weyl. See [2], §§III and IV.










property. Now (3.3) shows that all coefficients of (2.11) are non-negative. We may therefore re-write (2.11) in the form

(3.6)  $f(\cos\theta) \sim \sum_{n=0}^{\infty} a_n P_n^{(\lambda)}(\cos\theta), \quad (a_n \ge 0).$

On the other hand, we know this series to be Abel-summable for all θ, hence, in particular, for θ = 0. Thus

$\sum_{n=0}^{k} a_n\,|P_n^{(\lambda)}(\cos\theta)| \le \sum_{n=0}^{k} a_n P_n^{(\lambda)}(1) \le \lim_{r \to 1} \sum_{n=0}^{\infty} a_n r^n P_n^{(\lambda)}(1) = f(1).$

This shows that the series (3.6) is absolutely and uniformly convergent for all θ ($0 \le \theta \le \pi$), hence convergent to its Abel-sum which is f(cos θ).

The converse to the effect that the convergent series (3.5) defines a p. d. f(cos θ) is clear. Indeed, f(cos θ) is obviously continuous because the series (3.5) must converge uniformly. As f(cos θ) thus appears as the continuous limit of a sequence of p. d. functions, it is p. d. itself.


4. On positive definite functions in $S_\infty$. If f(cos θ) is p. d. in $S_\infty$, it is also p. d. in $S_m$. By Theorem 1 we are therefore assured to have an expansion with non-negative coefficients

(4.1)  $f(\cos\theta) = \sum_{n=0}^{\infty} a_n(\lambda)\,P_n^{(\lambda)}(\cos\theta), \quad (a_n(\lambda) \ge 0,\ 0 \le \theta \le \pi),$

which is valid for all values of λ of the form $\lambda = \tfrac12(m - 1)$, (m = 1, 2, 3, …). Setting

(4.2)  $p_n^{(\lambda)}(\cos\theta) = \frac{P_n^{(\lambda)}(\cos\theta)}{P_n^{(\lambda)}(1)},$

we have a similar expansion

(4.3)  $f(\cos\theta) = \sum_{n=0}^{\infty} b_n(\lambda)\,p_n^{(\lambda)}(\cos\theta), \quad (b_n(\lambda) \ge 0,\ 0 \le \theta \le \pi).$

Since (see [4], p. 95)

(4.4)  $\lim_{\lambda \to \infty} p_n^{(\lambda)}(\cos\theta) = \cos^n\theta,$

we should expect to derive from (4.3), by letting $\lambda \to \infty$, the following theorem.

THEOREM 2. A function f(cos θ) which is positive definite in $S_\infty$ is necessarily of the form

(4.5)  $f(\cos\theta) = \sum_{n=0}^{\infty} a_n \cos^n\theta, \quad (a_n \ge 0).$


Now (4.4) obviously holds uniformly in θ. This fact, however, will not suffice to prove Theorem 2 due to our scant information concerning the coefficients







$b_n(\lambda)$ of (4.3). What we shall actually need is that, for a fixed value of θ, (4.4) holds uniformly for all n. This we state as a separate lemma.

LEMMA 1. Let θ be such that $0 < \theta < \pi$. If $\epsilon > 0$ is arbitrarily small, we have

(4.6)  $|p_n^{(\lambda)}(\cos\theta) - \cos^n\theta| < \epsilon$ for all n = 0, 1, 2, …,

provided $\lambda > L(\theta, \epsilon)$.
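The uniformity in n asserted by Lemma 1 can be observed numerically. The sketch below is an illustration only: it fixes θ = 1, takes the maximum of $|p_n^{(\lambda)}(\cos\theta) - \cos^n\theta|$ over $0 \le n \le 30$ (a finite stand-in for "all n"), and compares three values of λ. It assumes the Gegenbauer normalization given by the generating function $(1 - 2rt + r^2)^{-\lambda}$; the names `gegenbauer`, `normalized` and `max_error` are hypothetical.

```python
import math

def gegenbauer(n, lam, x):
    """P_n^{(lam)}(x) by the three-term recurrence (assumes lam > 0)."""
    if n == 0:
        return 1.0
    prev, cur = 1.0, 2.0 * lam * x
    for k in range(2, n + 1):
        prev, cur = cur, (2.0 * (k + lam - 1.0) * x * cur
                          - (k + 2.0 * lam - 2.0) * prev) / k
    return cur

def normalized(n, lam, x):
    """p_n^{(lam)}(x) = P_n^{(lam)}(x) / P_n^{(lam)}(1), as in (4.2)."""
    return gegenbauer(n, lam, x) / gegenbauer(n, lam, 1.0)

def max_error(lam, theta=1.0, n_max=30):
    """Largest deviation |p_n(cos theta) - cos^n theta| for 0 <= n <= n_max."""
    c = math.cos(theta)
    return max(abs(normalized(n, lam, c) - c ** n) for n in range(n_max + 1))

# The deviation, maximized over n, should shrink as lam grows.
errors = [max_error(lam) for lam in (0.5, 50.0, 2000.0)]
```

The three entries of `errors` should decrease, which is the behaviour the lemma describes for a fixed θ strictly between 0 and π.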


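Proof of the lemma.¹⁰ From the known integral representation¹¹

$p_n^{(\lambda)}(\cos\theta) = \int_0^\pi (\cos\theta + i\sin\theta\cos\varphi)^n \sin^{2\lambda-1}\varphi\,d\varphi \Big/ \int_0^\pi \sin^{2\lambda-1}\varphi\,d\varphi,$

we have

(4.7)  $\Delta_n^{\lambda} = p_n^{(\lambda)}(\cos\theta) - \cos^n\theta = \int_0^\pi F_n(\theta, \varphi)\,\sin^{2\lambda-1}\varphi\,d\varphi \Big/ \int_0^\pi \sin^{2\lambda-1}\varphi\,d\varphi,$

where

(4.8)  $F_n(\theta, \varphi) = (\cos\theta + i\sin\theta\cos\varphi)^n - \cos^n\theta.$

Evidently

(4.9)  $|F_n(\theta, \varphi)| \le 2.$

Now we choose a δ = δ(θ) such that

(4.10)  $0 < \delta < \tfrac12\pi, \quad \cos^2\theta + \sin^2\theta\sin^2\delta < 1.$

From (4.7), (4.8) and (4.9), we have

$|\Delta_n^{\lambda}| \le 4 \int_0^{\frac12\pi-\delta} \sin^{2\lambda-1}\varphi\,d\varphi \Big/ \int_0^\pi \sin^{2\lambda-1}\varphi\,d\varphi + \int_{\frac12\pi-\delta}^{\frac12\pi+\delta} |F_n(\theta, \varphi)|\,\sin^{2\lambda-1}\varphi\,d\varphi \Big/ \int_0^\pi \sin^{2\lambda-1}\varphi\,d\varphi$

$\le 4 \int_0^{\frac12\pi-\delta} \sin^{2\lambda-1}\varphi\,d\varphi \Big/ \int_0^\pi \sin^{2\lambda-1}\varphi\,d\varphi + (\cos^2\theta + \sin^2\theta\sin^2\delta)^{\frac12 n} + |\cos\theta|^n.$

Now let ε be given. Let $n_0 = n_0(\theta, \epsilon)$ be such that $n > n_0$ implies

$(\cos^2\theta + \sin^2\theta\sin^2\delta)^{\frac12 n} + |\cos\theta|^n < \tfrac12\epsilon.$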


10 This elegant proof of Lemma 1 is due to Professor Szegő. My original proof was
more complicated and based on a new estimate which I shall state here since it might 
possibly be used elsewhere. For 0 < 6 < 4x, A 2 5, and n = 1, 2, 3, --- the following in- 


equality holds 
1 ee\im 41 29 
| P (eos 0) | = PO ayd( Lt eoste\” 5 4 At oowee\ 
2 n sin (20) } 


11 See [4], formulas (9) and (11), pp. 193-194. 










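The existence of such an $n_0$ is assured by (4.10). Furthermore, suppose $\lambda_0 = \lambda_0(\theta, \epsilon)$ is such that $\lambda > \lambda_0$ implies

$4 \int_0^{\frac12\pi-\delta} \sin^{2\lambda-1}\varphi\,d\varphi \Big/ \int_0^\pi \sin^{2\lambda-1}\varphi\,d\varphi < \tfrac12\epsilon.$

Hence $|\Delta_n^{\lambda}| < \epsilon$, provided $\lambda > \lambda_0(\theta, \epsilon)$, $n > n_0(\theta, \epsilon)$. On the other hand, from (4.4) we get that $|\Delta_n^{\lambda}| < \epsilon$ for n = 0, 1, …, $n_0(\theta, \epsilon)$, provided $\lambda > \lambda_1(\theta, \epsilon)$. Now the lemma is proved if we take as L(θ, ε) the larger of the two numbers $\lambda_0$ and $\lambda_1$.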

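Proof of Theorem 2. Our starting point is the expansion (4.3). Since $p_n^{(\lambda)}(1) = 1$ and hence

$f(1) = \sum_{n=0}^{\infty} b_n(\lambda),$

we see that the coefficients $b_n(\lambda)$ are uniformly bounded for all n and the range of values of λ. By Cantor's diagonal process we may find a subsequence $\lambda_\nu \to \infty$ such that

(4.11)  $\lim_{\nu \to \infty} b_n(\lambda_\nu) = a_n \ge 0, \quad (n = 0, 1, 2, \dots).$

Let θ have a fixed value between 0 and π. Let us write the relation (4.3) in the form

$f(\cos\theta) = \sum_{n=0}^{\infty} b_n(\lambda_\nu)\cos^n\theta + \sum_{n=0}^{\infty} b_n(\lambda_\nu)\,[p_n^{(\lambda_\nu)}(\cos\theta) - \cos^n\theta].$

Since by the lemma we have

$\left|\sum_{n=0}^{\infty} b_n(\lambda_\nu)\,[p_n^{(\lambda_\nu)}(\cos\theta) - \cos^n\theta]\right| < \epsilon \sum_{n=0}^{\infty} b_n(\lambda_\nu) = \epsilon f(1),$

provided $\lambda_\nu$ is sufficiently large, we may write

(4.12)  $f(\cos\theta) = \sum_{n=0}^{\infty} b_n(\lambda_\nu)\cos^n\theta + \sigma,$

where $|\sigma| < \epsilon f(1)$ for sufficiently large $\lambda_\nu$. However, the series

(4.13)  $\sum_{n=0}^{\infty} b_n(\lambda_\nu)\cos^n\theta$

converges uniformly with respect to the variable $\lambda_\nu$ because it is majorized by the convergent series with constant terms

$\sum_{n=0}^{\infty} f(1)\,|\cos\theta|^n.$

But now the limiting relations (4.11) imply that the series (4.13) will tend to $\sum a_n \cos^n\theta$ as $\lambda_\nu \to \infty$. Thus (4.12) may be written as

$f(\cos\theta) = \sum_{n=0}^{\infty} a_n \cos^n\theta + \sigma',$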





where $|\sigma'| \le \epsilon f(1)$. Now letting $\epsilon \to 0$, we conclude that σ' = 0 and hence that

(4.14)  $f(\cos\theta) = \sum_{n=0}^{\infty} a_n \cos^n\theta, \quad (a_n \ge 0).$

There remains to show that (4.14) is also valid for θ = 0 and θ = π. This last point, however, is readily settled. Indeed, (4.14) implies the convergence of the series $\sum a_n$ by letting $\theta \to 0$. Now the continuity of both sides of the relation (4.14) at both ends of the interval $0 \le \theta \le \pi$ implies its validity throughout this closed interval.


5. On metric transforms of $S_\infty$ which are imbeddable in Hilbert space. Suppose F(t), ($0 \le t \le \pi$), is continuous, F(0) = 0, $F(t) \ge 0$ if $0 < t \le \pi$. Let us remetrize the metric space $S_\infty$ from the original distance function pq to the new distance function F(pq). The new semi-metric space thus obtained is called the metric transform of $S_\infty$ by the function F(t) and denoted by the symbol $F(S_\infty)$. We shall prove the following theorem.¹²

THEOREM 3. The metric transform $F(S_\infty)$ is isometrically imbeddable in H if and only if

(5.1)  $F^2(t) = \sum_{n=1}^{\infty} a_n (1 - \cos^n t), \quad (a_n \ge 0,\ 0 \le t \le \pi).$

We need the following two lemmas.

LEMMA 2. The metric transform $F(S_\infty)$ is imbeddable in H if and only if the function $\exp\{-\lambda F^2(t)\}$ is positive definite in $S_\infty$ for all λ > 0.

This lemma is known to be valid not only for $S_\infty$ but for any separable metric space ([6], Theorem 1, p. 527).

LEMMA 3. Let K denote the class of functions φ(x) of the form

(5.2)  $\varphi(x) = \sum_{n=1}^{\infty} a_n (1 - x^n), \quad (a_n \ge 0,\ -1 \le x \le 1).$

If $\{\varphi_n(x)\}$ is a sequence of functions of this class and

(5.3)  $\lim_{n \to \infty} \varphi_n(x) = \varphi_0(x), \quad (-1 \le x \le 1),$

where $\varphi_0(x)$ is continuous in $-1 \le x \le 1$, then also $\varphi_0(x)$ belongs to the class K.


12 A similar theorem concerning $S_m$ is as follows. $F(S_m)$ is imbeddable in H if and only if

$F^2(t) = \sum_{n=1}^{\infty} a_n (1 - p_n^{(\lambda)}(\cos t)), \quad (\lambda = \tfrac12(m - 1),\ 0 \le t \le \pi).$

However, this theorem is implied by a general result of Bochner ([2], Theorem 3) concerning a certain type of compact spaces, of which the simplest example is the finite dimensional sphere $S_m$. Our method of proving Theorem 3 was used before in the case of Euclidean and Hilbert spaces ([7], §5, [3], Part III).








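Proof of Lemma 3. We prove first the following statement. Suppose

$\psi(x) = \sum_{\nu=0}^{\infty} c_\nu x^\nu, \quad (c_\nu \ge 0,\ -1 < x < 1),$

such that

$\int_0^{1-0} \psi(x)\,dx$ exists.

Then the class K is identical with the class $K_1$ of functions of the form

(5.4)  $\varphi(x) = \int_x^1 \psi(x)\,dx, \quad (-1 \le x < 1), \quad \varphi(1) = 0.$

Proof that $K \subset K_1$. From $\varphi(x) = \sum_{n=1}^{\infty} a_n (1 - x^n)$ we get $-\varphi'(x) = \sum_{n=1}^{\infty} n a_n x^{n-1} = \psi(x)$ for −1 < x < 1; hence

$\varphi(x) - \varphi(1 - \epsilon) = \int_x^{1-\epsilon} \psi(x)\,dx.$

Letting $\epsilon \to 0$ we get (5.4); hence $\varphi(x) \in K_1$.
Proof that $K_1 \subset K$. Suppose $\varphi(x) \in K_1$, and hence is of the form (5.4). We have

$\varphi(x) - \varphi(1 - \epsilon) = \int_x^{1-\epsilon} \left(\sum_{\nu=0}^{\infty} c_\nu x^\nu\right) dx = \sum_{\nu=0}^{\infty} \frac{c_\nu}{\nu + 1}\left((1 - \epsilon)^{\nu+1} - x^{\nu+1}\right).$

Letting $\epsilon \to 0$ we get $\varphi(x) = \sum_{\nu=0}^{\infty} (1 - x^{\nu+1})\,c_\nu/(\nu + 1)$; hence $\varphi(x) \in K$.

Returning to the assumptions of Lemma 3, let

$\varphi_n(x) = \sum_{\nu=1}^{\infty} a_{n\nu} (1 - x^\nu).$

If $x = re^{i\theta}$ ($0 \le r < 1$), we have

$\varphi_n(x) = \sum_{\nu=1}^{\infty} a_{n\nu}(1 - r^\nu e^{i\nu\theta}) = \sum_{\nu=1}^{\infty} a_{n\nu}(1 - r^\nu) + \sum_{\nu=1}^{\infty} a_{n\nu}\,r^\nu (1 - e^{i\nu\theta});$

hence

(5.5)  $|\varphi_n(x)| \le \varphi_n(r) + 2 \sum_{\nu=1}^{\infty} a_{n\nu}\,r^\nu.$

But $r^\nu < A \cdot (1 - r^\nu)$ for all ν = 1, 2, …, for a certain A = A(r). Thus (5.5) implies that

$|\varphi_n(x)| \le \varphi_n(r) + 2A(r)\varphi_n(r)$

for |x| = r and therefore also if |x| < r. From (5.3) we now conclude that the $\varphi_n(x)$ are uniformly bounded in every circle $|x| \le r$ (r < 1). By the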






Vitali-Porter convergence theorem, (5.3) implies that the $\varphi_n(x)$ converge uniformly inside |x| < 1 and that $\varphi_0(x)$ is therefore analytic and regular in |x| < 1. Since $\varphi_n(x) \in K_1$, we know that $-\varphi_n'(x)$ has all its derivatives non-negative at the origin. From $\varphi_n^{(\nu)} \to \varphi_0^{(\nu)}$ (uniformly inside |x| < 1) we conclude that also $-\varphi_0'(x)$ has all derivatives non-negative at the origin; hence

$\psi_0(x) = -\varphi_0'(x) = \sum_{\nu=0}^{\infty} c_\nu x^\nu, \quad (c_\nu \ge 0,\ -1 < x < 1).$

This also implies

$\varphi_0(x) - \varphi_0(1 - \epsilon) = \int_x^{1-\epsilon} \psi_0(x)\,dx.$

Since $\varphi_0(1 - 0) = \varphi_0(1) = 0$ we get, by letting $\epsilon \to 0$,

$\varphi_0(x) = \int_x^1 \psi_0(x)\,dx.$

Hence, $\varphi_0(x) \in K_1 = K$ and our proof is completed.
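Proof of Theorem 3. In order to prove the direct part of the theorem, let F(t) be defined by (5.1). Now, for λ > 0,

$\exp\{-\lambda F^2(t)\} = \exp\left\{-\lambda \sum_{n=1}^{\infty} a_n (1 - \cos^n t)\right\} = \exp\left\{-\lambda \sum_{n=1}^{\infty} a_n\right\} \cdot \exp\left\{\lambda \sum_{n=1}^{\infty} a_n \cos^n t\right\}$

evidently admits an expansion of the form

(5.6)  $\exp\{-\lambda F^2(t)\} = \sum_{n=0}^{\infty} b_n(\lambda)\cos^n t, \quad (0 \le t \le \pi,\ b_n(\lambda) \ge 0),$

and is therefore p. d. in $S_\infty$ by Theorem 2.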
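The expansion (5.6) can be made concrete for a finite choice of coefficients. The sketch below is a minimal illustration, assuming only finitely many non-zero $a_n$ (the sample values are hypothetical): it computes the power-series coefficients $b_n(\lambda)$ of $\exp\{-\lambda \sum a_n (1 - x^n)\}$ in $x = \cos t$ by the standard recurrence for the exponential of a power series.

```python
import math

def exp_series_coefficients(a, lam, terms):
    """Coefficients b_0, ..., b_{terms-1} of exp(-lam * sum_n a_n (1 - x^n)) in powers of x.

    a[n-1] holds a_n for n = 1, ..., len(a); all other a_n are taken to be zero.
    """
    g = [0.0] * terms                      # g(x) = lam * sum_n a_n x^n (no constant term)
    for n, a_n in enumerate(a, start=1):
        if n < terms:
            g[n] = lam * a_n
    b = [0.0] * terms
    b[0] = math.exp(-lam * sum(a))         # constant factor exp(-lam * sum a_n)
    for n in range(1, terms):
        # From (e^g)' = g' e^g:  n b_n = sum_{k=1}^{n} k g_k b_{n-k}
        b[n] = sum(k * g[k] * b[n - k] for k in range(1, n + 1)) / n
    return b

def F_squared(a, t):
    """F^2(t) of (5.1) for the finite coefficient list a."""
    return sum(a_n * (1.0 - math.cos(t) ** n) for n, a_n in enumerate(a, start=1))

a = [0.5, 0.3, 0.2]                        # sample a_1, a_2, a_3 (all >= 0)
b = exp_series_coefficients(a, lam=1.0, terms=60)
series_at_2 = sum(b_n * math.cos(2.0) ** n for n, b_n in enumerate(b))
```

Every $b_n$ comes out non-negative because the recurrence only adds products of non-negative numbers, and the coefficients sum to 1 since both sides of (5.6) equal 1 at t = 0.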

The converse part is proved as follows. Let $F(S_\infty)$ be imbeddable in H. By Lemma 2 and Theorem 2 the expansion (5.6) is valid for all λ > 0 and therefore

$\Phi(t, \lambda) = \frac{1}{\lambda}\left(1 - \exp\{-\lambda F^2(t)\}\right) = \sum_{n=1}^{\infty} \frac{1}{\lambda}\,b_n(\lambda)(1 - \cos^n t).$

However, $\Phi(t, \lambda) \to F^2(t)$ as $\lambda \to 0$ and $\Phi(t, \lambda)$ is a function of class K if regarded as a function of the variable x = cos t. As the limit function $F^2(t)$ is continuous throughout $0 \le t \le \pi$, Lemma 3 implies that $F^2(t)$ is of the desired form (5.1).

Remarks. 1. If in (5.1) we set $a_1 = 2$, $a_2 = a_3 = \cdots = 0$, we get $F_1^2(t) = 2(1 - \cos t) = 4\sin^2(\tfrac12 t)$; hence

$F_1(t) = 2\sin\tfrac12 t.$

By Theorem 3, $F_1(S_\infty)$ is imbeddable in H. This is a trivial result, for $F_1(S_\infty)$ is the sphere $S_\infty$ with the spherical distance pq changed to the Euclidean chord-distance $2\sin(\tfrac12 pq)$, which is already imbedded in H.







By taking only the second term of the expansion (5.1), we get $F_2^2(t) = 1 - \cos^2 t = \sin^2 t$; hence

$F_2(t) = \sin t.$

It should be noticed that this remetrization identifies diametrically opposite points on the sphere $S_\infty$. This identification occurs for an F(t) defined by (5.1) if and only if $a_n = 0$ whenever n is odd.

2. Let $F(S_\infty)$ be imbeddable in H. We want to show that $F(S_\infty)$ may also be placed in some appropriate spherical shell of H. Indeed, let us adjoin to the space $F(S_\infty)$ a new point A such that, if $p \in F(S_\infty)$, its distance to A is d(A, p) = r. We want to determine the constant r so that the space $F(S_\infty) + A$ can be imbedded in H. This requires (see [3], p. 229) that

$\sum_{i,k=1}^{N} \{r^2 + r^2 - F^2(p_ip_k)\}\,x_ix_k \ge 0$

for arbitrary points $p_1, \dots, p_N$ of $S_\infty$. By (5.1) this amounts to

$\sum_{i,k=1}^{N} \left\{2r^2 - \sum_{n=1}^{\infty} a_n (1 - \cos^n p_ip_k)\right\} x_ix_k \ge 0.$

If we set

(5.7)  $r = \left(\tfrac12 \sum_{n=1}^{\infty} a_n\right)^{\frac12},$

our last inequality reduces to

$\sum_{n=1}^{\infty} a_n \sum_{i,k=1}^{N} \cos^n(p_ip_k)\,x_ix_k \ge 0,$

which is correct since $\cos^n t$ is p. d. in $S_\infty$. We state this result as a

COROLLARY. The metric transform $F(S_\infty)$ of Theorem 3, if imbedded isometrically in H, will lie on a spherical shell of H of radius (5.7).
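The reduction leading to the Corollary can be illustrated numerically. The sketch below uses hypothetical sample coefficients $a_1, a_2, a_3$, takes r from (5.7), and evaluates the quadratic form with kernel $2r^2 - F^2(p_ip_k)$ on random points of a finite-dimensional unit sphere, used here as a stand-in for a finite point set of $S_\infty$. All names are introduced for this example only.

```python
import math
import random

a = [0.5, 0.3, 0.2]                       # hypothetical a_1, a_2, a_3 (all >= 0)
r = math.sqrt(0.5 * sum(a))               # radius given by (5.7)

def F_squared(t):
    """F^2(t) of (5.1) for the finite coefficient list a."""
    return sum(a_n * (1.0 - math.cos(t) ** n) for n, a_n in enumerate(a, start=1))

def unit_vector(dim, rng):
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def min_shell_form(num_points=10, trials=300, dim=6, seed=1):
    """Smallest observed value of sum_{i,k} {2 r^2 - F^2(p_i p_k)} x_i x_k."""
    rng = random.Random(seed)
    pts = [unit_vector(dim, rng) for _ in range(num_points)]

    def kernel(p, q):
        c = max(-1.0, min(1.0, sum(s * t for s, t in zip(p, q))))
        return 2.0 * r * r - F_squared(math.acos(c))   # spherical distance is acos(c)

    gram = [[kernel(p, q) for q in pts] for p in pts]
    worst = float("inf")
    for _ in range(trials):
        x = [rng.gauss(0.0, 1.0) for _ in range(num_points)]
        form = sum(gram[i][k] * x[i] * x[k]
                   for i in range(num_points) for k in range(num_points))
        worst = min(worst, form)
    return worst

worst = min_shell_form()
```

With r chosen as in (5.7) the kernel equals $\sum a_n \cos^n t$, so the observed minimum should be non-negative up to rounding.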


BIBLIOGRAPHY 


1. P. Appell et J. Kampé de Fériet, Fonctions hypergéométriques et hypersphériques. Polynômes d'Hermite, Paris, 1926.

2. S. Bochner, Hilbert distances and positive definite functions, Annals of Mathematics, vol. 42(1941), pp. 647-656.

3. John von Neumann and I. J. Schoenberg, Fourier integrals and metric geometry, Transactions of the American Mathematical Society, vol. 50(1941), pp. 226-251.

4. N. Nielsen, Théorie des fonctions métasphériques, Paris, 1911.

5. G. Pólya und G. Szegő, Aufgaben und Lehrsätze aus der Analysis, vol. 2, Berlin, 1925.

6. I. J. Schoenberg, Metric spaces and positive definite functions, Transactions of the American Mathematical Society, vol. 44(1938), pp. 522-536.

7. I. J. Schoenberg, Metric spaces and completely monotone functions, Annals of Mathematics, vol. 39(1938), pp. 811-841.

8. W. H. Young, A note on a class of symmetric functions and on a theorem required in the theory of integral equations, Messenger of Mathematics, vol. 40(1910), pp. 37-43.


UNIVERSITY OF PENNSYLVANIA. 



















THE ANALYTIC PROLONGATION OF A MINIMAL SURFACE 
By E. F. Beckenbach


1. Introduction. A classical theorem is the following.¹


THEOREM A. If a minimal surface S cuts a plane Π orthogonally, then S is symmetric with respect to Π.


We shall establish the following generalization of Theorem A.


THEOREM B. If a minimal surface S is bounded in part by an arc C of a curve that lies in a plane Π, and if S approaches Π orthogonally, then S can be continued analytically as a minimal surface across Π and the extended surface is symmetric with respect to Π.


A similar generalization of the following classical result has been given by J. Douglas:² if a minimal surface contains a straight line in its interior, then the straight line must be an axis of symmetry of the surface.

The proofs of Theorem A which have been given depend essentially on the fact that the plane curve is an interior curve on S. To prove Theorem B, it would be sufficient, in virtue of Theorem A, to prove that S can be continued analytically across Π. Actually, though, we establish both the possibility of analytic continuation and the symmetry at the same time, so that in particular a proof of Theorem A is included in our proof of Theorem B.

It is to be noted (see (2.3)) that we do not assume the arc C to be analytic. We prescribe the behavior of only one of the coordinate components, z(u, v), as the parameter point approaches an arbitrary point on a given segment ab of the boundary of the domain of definition D. Indeed we are assuming not even that the part of the boundary of S in question is an arc of curve but only that the boundary lies in a plane.

It follows as a consequence of Theorem B, however, that the boundary of S on Π must necessarily be an analytic arc.

It is further to be noted (see (2.4)) that we do not assume that the normal to S approaches a definite position as the parameter point approaches a fixed point on ab but only that the component of the normal perpendicular to Π approaches zero.


2. Analytic formulation. We shall denote by D the upper half of the 
u, v-plane, v > 0; by D’ the lower half, v < 0; and by ab a fixed open segment 


Received August 26, 1941. 

1 See, for instance, the author's paper, Minimal surfaces in Euclidean n-space, American Journal of Mathematics, vol. 55(1933), pp. 458-468.

2 J. Douglas, The analytic prolongation of a minimal surface over a rectilinear segment 
of its boundary, Duke Mathematical Journal, vol. 5(1939), pp. 21-29; see also Proc. Nat. 
Acad. Sci. U. S. A., vol. 26(1940), pp. 215-221. 







110 E. F. BECKENBACH 


a < u < b, v = 0 of the u-axis. For a surface S with coordinate functions given by
$x = x(u, v), \quad y = y(u, v), \quad z = z(u, v),$
we shall denote the direction cosines of the normal to S by X(u, v), Y(u, v), Z(u, v).
Suppose the functions x(u, v), y(u, v), z(u, v) have the following properties:
(2.1) they are harmonic in D,
$x_{uu} + x_{vv} = 0, \quad y_{uu} + y_{vv} = 0, \quad z_{uu} + z_{vv} = 0;$
(2.2) they satisfy throughout D the relations
$x_u^2 + y_u^2 + z_u^2 = x_v^2 + y_v^2 + z_v^2, \quad x_ux_v + y_uy_v + z_uz_v = 0;$
(2.3) for all $(u_0, 0)$ on ab, i.e., for $a < u_0 < b$, v = 0,
$\lim_{(u,v) \to (u_0,+0)} z(u, v) = 0;$
(2.4) for all $(u_0, 0)$ on ab, i.e., for $a < u_0 < b$, v = 0,
$\lim_{(u,v) \to (u_0,+0)} Z(u, v) = 0.$
We shall show that under these hypotheses, the functions x(u, v), y(u, v), z(u, v) can be extended analytically across ab into D', in accordance with
(1)  $x(u, -v) = x(u, v), \quad y(u, -v) = y(u, v), \quad z(u, -v) = -z(u, v).$
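A familiar surface satisfying hypotheses of this kind is the catenoid in its isothermal parametrization, which meets the plane z = 0 orthogonally along its waist circle. The sketch below is an illustrative example chosen here, not one taken from the text: it checks the symmetry relations (1) and the boundary behaviour (2.3), (2.4) numerically. The function names are hypothetical.

```python
import math

def surface(u, v):
    """Catenoid: x = cosh v cos u, y = cosh v sin u, z = v (harmonic, isothermal)."""
    return (math.cosh(v) * math.cos(u), math.cosh(v) * math.sin(u), v)

def normal_z(u, v):
    """Third direction cosine Z(u, v) of the normal, from closed-form derivatives."""
    xu, yu = -math.cosh(v) * math.sin(u), math.cosh(v) * math.cos(u)
    xv, yv = math.sinh(v) * math.cos(u), math.sinh(v) * math.sin(u)
    e = xu * xu + yu * yu          # z_u = 0, so E = x_u^2 + y_u^2 (= cosh^2 v)
    return (xu * yv - xv * yu) / e  # length of the normal vector equals E when E = G, F = 0

u0, v0 = 0.3, 0.7
upper = surface(u0, v0)
lower = surface(u0, -v0)            # the reflected parameter point
z_near_boundary = surface(u0, 1e-6)[2]
Z_near_boundary = normal_z(u0, 1e-6)
```

Here `upper` and `lower` should agree in x and y and differ in sign in z, matching (1), while `z_near_boundary` and `Z_near_boundary` are both close to zero, matching (2.3) and (2.4).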


3. Proof. By (2.3) and the principle of symmetry for harmonic functions, z(u, v) can be continued as a harmonic function across ab into D' in accordance with the third formula in (1).
Since z(u, 0) = 0 on ab, it follows that on ab we have

$z_u = 0.$

Further, $z_v$ is continuous on ab. Hence, by (2.2), for $(u_0, 0)$ on ab,

(2)  $\lim_{(u,v) \to (u_0,+0)} (x_u^2 + y_u^2 - x_v^2 - y_v^2) = z_v^2(u_0, 0),$

(3)  $\lim_{(u,v) \to (u_0,+0)} (x_ux_v + y_uy_v) = 0;$

and by (2.4),

(4)  $\lim_{(u,v) \to (u_0,+0)} \frac{x_uy_v - x_vy_u}{[(x_u^2 + y_u^2 + z_u^2)(x_v^2 + y_v^2 + z_v^2)]^{\frac12}} = 0.$

From (3), (4) and the identity

$(x_ux_v + y_uy_v)^2 + (x_uy_v - x_vy_u)^2 = (x_u^2 + y_u^2)(x_v^2 + y_v^2),$

we have

(5)  $(x_u^2 + y_u^2)(x_v^2 + y_v^2) = \eta(u, v) + \zeta(u, v)(x_u^2 + y_u^2 + z_u^2)(x_v^2 + y_v^2 + z_v^2),$


















ANALYTIC PROLONGATION OF MINIMAL SURFACE 111 


where

$\lim_{(u,v) \to (u_0,+0)} \eta(u, v) = 0, \quad \lim_{(u,v) \to (u_0,+0)} \zeta(u, v) = 0.$

From (5) and the continuity of $z_u$ and $z_v$ on ab, it follows that we have

(6)  $\lim_{(u,v) \to (u_0,+0)} (x_u^2 + y_u^2)(x_v^2 + y_v^2) = 0.$

Now (2) and (6) imply

$\lim_{(u,v) \to (u_0,+0)} (x_v^2 + y_v^2) = 0;$

that is,

(7)  $\lim_{(u,v) \to (u_0,+0)} x_v = \lim_{(u,v) \to (u_0,+0)} y_v = 0.$

From (7) it follows that the harmonic functions $x_v(u, v)$ and $y_v(u, v)$ can be continued as harmonic functions across ab in accordance with

(8)  $x_v(u, -v) = -x_v(u, v), \quad y_v(u, -v) = -y_v(u, v).$

Partial integration of the functions in (8) with respect to v yields the analytic continuation of x(u, v), y(u, v) in accordance with the first two formulas in (1).


4. Remark (added in proof). In the theorem of Douglas, it is shown that conditions (2.1), (2.2) and

(9)  $\lim_{(u,v) \to (u_0,+0)} x(u, v) = 0, \quad \lim_{(u,v) \to (u_0,+0)} y(u, v) = 0$

imply that x(u, v), y(u, v), z(u, v) can be extended analytically across ab into D' in accordance with

(10)  $x(u, -v) = -x(u, v), \quad y(u, -v) = -y(u, v), \quad z(u, -v) = z(u, v).$

A simple proof follows. The first two equations in (10) follow from (9) and the principle of symmetry, so that on ab the functions $x_u, y_u, x_v, y_v$ are continuous and $x_u = y_u = 0$. By (2.2), then,

$\lim_{(u,v) \to (u_0,+0)} (z_u^2 - z_v^2) = x_v^2(u_0, 0) + y_v^2(u_0, 0) \ge 0, \quad \lim_{(u,v) \to (u_0,+0)} z_uz_v = 0;$

consequently,

$\lim_{(u,v) \to (u_0,+0)} z_v = 0,$

so that $z_v$ can be extended harmonically across ab in accordance with

$z_v(u, -v) = -z_v(u, v).$

Partial integration yields the third equation in (10).


THE UNIVERSITY OF MICHIGAN, 








ADDITIVE FUNCTIONS AND ALMOST PERIODICITY 
By Philip Hartman and Aurel Wintner


1. Let f = f(n) be a function defined for n = 1, 2, …. Its mean-value, M(f), is usually defined by

$\frac{f(1) + \cdots + f(n)}{n} \to M(f) \quad \text{as } n \to \infty,$

provided that this limit exists (this proviso should include that $M(f) \ne \pm\infty$). However, in certain connections, this definition of a mean-value turns out to be too vague. Correspondingly, the Disquisitiones Arithmeticae consider a more restrictive definition, which today can be formulated as follows:

$\frac{f(m + 1) + \cdots + f(m + n)}{n} \to M(f) \quad \text{as } n \to \infty$

uniformly for all m (= 1, 2, …). Let M(f) then be denoted also by M*(f).

Thus M(f) exists and equals M*(f) whenever M*(f) exists. But M*(f) need 
not exist when M(f) exists. The situation is illustrated by the following ob- 
servations: 

(i) If f(n) = 1 or f(n) = n' according as n is not or is a perfect square, then M(f) exists. Hence $\limsup f(n) = \infty$ does not preclude the existence of M(f) for an f. But this situation is changed if M is replaced by M*, since |f(n)| < const. is a necessary condition for the existence of M*(f). In fact, the existence of M*(f) means that, if ε > 0 is arbitrary and if $N_\epsilon$ is suitably chosen, then

$\left|\sum_{k=m}^{m+n-1} f(k) - nM^*(f)\right| < \epsilon n \quad \text{whenever } n \ge N_\epsilon,$

where m is arbitrary. Hence, if ε = 1 and $N = N_1$, then

$\left|\sum_{k=m}^{m+N} f(k) - (N + 1)M^*(f)\right| < N + 1, \quad \left|\sum_{k=m+1}^{m+N} f(k) - NM^*(f)\right| < N$

for every m. Consequently, for every m,

$|f(m) - M^*(f)| < N + 1 + N.$


Since M*(f) and N are independent of m, it follows that f is bounded. 

(ii) Birkhoff's ergodic theorem states that, under his assumptions, the sequence of images of an arbitrary L-integrable function possesses an M-mean almost everywhere. But the theorem becomes false if M is replaced by M*. This is clear from (i) since the L-integrable function can be chosen rather unbounded. Since Borel's law of large numbers is implied by Birkhoff's ergodic theorem, it will follow from (iii) below that there exist ergodic flows (and, as a matter of fact, mixtures) for which M cannot be replaced by M* even in case of bounded functions.


Received September 30, 1941.
1 C. F. Gauss, Werke, vol. 1, pp. 362-366.

(iii) If $f_t(n)$, where 0 < t < 1, denotes the n-th digit in the (infinite) dyadic representation of t, then, according to Borel, $M(f_t)$ exists for almost all t (and equals ½ for almost all t). On the other hand, $M^*(f_t)$ exists only when t is chosen in a certain set of measure 0. This is implied by the fact that the dyadic representation of almost every t contains arbitrarily long stretches of both digits, it being understood that every stretch contains only one of the digits.

(iv) Let $t_0$ denote the t-value defined by $f_{t_0}(n) = \tfrac12 + \tfrac12\lambda(n)$, where λ is Liouville's factor (that is, $\lambda(n) = (-1)^l$, if n has exactly l prime factors, which need not be distinct). Then $t_0$ is "normal" in the sense of (iii). In other words, $M(f_{t_0})$ exists (= ½) but $M^*(f_{t_0})$ does not exist. In fact, the prime number theorem is known to be equivalent to the assertion that M(λ) exists (= 0) and the non-existence of M*(λ) follows from other properties of the distribution of the primes. (If ε > 0 is arbitrary, Riemann's hypothesis is known to be equivalent to $\lambda(1) + \cdots + \lambda(n) = O(n^{\frac12+\epsilon})$ and can therefore be expressed by saying that $t_0$ is "normal" in the sense of the standard estimate $f_t(1) + \cdots + f_t(n) = \tfrac12 n + O(n^{\frac12+\epsilon})$ of the successive sums of the dyadic digits of almost every t.)

(v) The requirement that the mean square deviation of an f = f(n) from suitable finite trigonometric sums in n be arbitrarily small is equivalent to the almost periodicity of f in the sense of Besicovitch (B²) or of Weyl (W²) according as the mean is defined as M or as M*. This is clear from the definitions in the first case and is easily verified in the second case.² That the replacement of M by M* actually restricts the almost periodic class admitted, i.e., that not every f which is almost periodic (B²) is almost periodic (W²), is implied by the fundamental fact that the space (B²) is, but the space (W²) is not, complete with reference to the topology determined by the metric $M(|f_1 - f_2|^2)$, $M^*(|f_1 - f_2|^2)$ respectively.

(vi) An unbounded f(n) can be almost periodic (B²). It cannot be almost periodic (W²). In fact, if an f(n) is almost periodic (W²), then M*(f) exists, and therefore |f(n)| < const., by (i).


2. A function f = f(n) is called additive if $f(n_1n_2) = f(n_1) + f(n_2)$ whenever $n_1$ and $n_2$ are relatively prime (in particular f(1) = 0 for every additive f). Thus, if p denotes a prime number and k a positive integer, an additive f is uniquely determined by an arbitrary double sequence $\{\{c_p^{(k)}\}\}$ and by the assignments

$f(p^k) = c_p^{(k)}$


2 H. Weyl, Integralgleichungen und fastperiodische Funktionen, Mathematische Annalen, vol. 97(1926), pp. 338-356 (end). It is understood that the functions f(t), $-\infty < t < \infty$, considered there are now replaced by functions of the positive integer n. Cf. also B. Jessen and A. Wintner, Distribution functions and the Riemann zeta function, Transactions of the American Mathematical Society, vol. 38(1935), pp. 48-88 (§11-§12).







and

(1)  $f(n) = \sum f(p^k)$, the sum being extended over those prime powers for which $p^k \mid n$ but $p^{k+1} \nmid n$.
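The way an additive function is built from its values on prime powers can be sketched as follows. This is a minimal illustration: `factorize` and `additive` are hypothetical helper names, and the two sample assignments (f(p^k) = k and f(p^k) = 1) are chosen here because they give the familiar counts of prime factors with and without multiplicity.

```python
import math

def factorize(n):
    """Return {p: k} where p^k divides n exactly, by trial division (n >= 1)."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def additive(value_on_prime_power):
    """Build f(n) as the sum of value_on_prime_power(p, k) over p^k exactly dividing n."""
    def f(n):
        return sum(value_on_prime_power(p, k) for p, k in factorize(n).items())
    return f

big_omega = additive(lambda p, k: k)      # number of prime factors with multiplicity
little_omega = additive(lambda p, k: 1)   # number of distinct prime factors

n1, n2 = 35, 12                           # relatively prime sample arguments
```

For relatively prime arguments the values add, and f(1) = 0 because the sum is then empty.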


It was recently shown³ that an additive f is almost periodic (B²) if and only if both series

$\sum_p \frac{f(p)}{p}, \quad \sum_p \sum_{k=1}^{\infty} \frac{|f(p^k)|^2}{p^k}$

are convergent (the first of these series, which need not be absolutely convergent, being thought of as ordered according to increasing primes p). The main object of the present note is to show that an additive f is almost periodic (W²) if and only if it is bounded. For an explicit criterion, cf. the end of §5 below.

Since the existence of M*(f) is necessary but not sufficient for the almost periodicity (W²) of an arbitrary f, it is clear from (i) that the italicized theorem may be formulated as follows.


THEOREM. If f(n) is additive, the trivial necessary condition |f(n)| < const. is sufficient not only for the existence of M*(f) but also for the almost periodicity (W²) of f(n); in particular, an additive f(n) is almost periodic (W²) whenever M*(f) exists.

In view of (vi), only the sufficiency of this criterion needs a proof. Obviously, the theorem is purely arithmetical in nature; in fact, an arbitrary bounded f(n) need not be almost periodic (W²) and need not even have a mean-value M(f).


3. An arithmetical class of functions g(n) considered by Toeplitz⁴ (for p = 2) may be defined by choosing an arbitrary prime number p and a sequence of values $c_0, c_1, \dots$, and then placing

$g(p^k) = c_k$

and

(2)  $g(n) = g(p^k)$, if $p^k \mid n$ but $p^{k+1} \nmid n$, $(k \ge 0)$.

Let such a function g of the positive integer n be called a p-function (belonging to the fixed prime p).
It is known⁵ that a p-function possesses a mean-value M(g) if and only if

$\sum_{k=0}^{\infty} \frac{g(p^k)}{p^k}$


3 P. Erdős and A. Wintner, Additive functions and almost periodicity (B²), American Journal of Mathematics, vol. 62(1940), pp. 635-645.

4 O. Toeplitz, Ein Beispiel zur Theorie der fastperiodischen Funktionen, Mathematische Annalen, vol. 98(1928), pp. 281-295.

5 E. R. van Kampen and A. Wintner, On the almost periodic behavior of multiplicative number-theoretical functions, American Journal of Mathematics, vol. 62(1940), pp. 613-626 (Theorem I).













is a convergent series. It will now be shown that M*(g) exists if and only if the p-function g is bounded:

(3)  |g(n)| < const. for all n.

According to (2), this is equivalent to

(4)  $|g(p^k)| <$ const. for all k.

The necessity of (3) follows from (i), §1.
In the proof of the sufficiency of (3), let it be assumed, without loss of generality, that g is real-valued. For every pair n, m of positive integers, let a function $\varphi_n^m(x)$, $-\infty < x < \infty$, be defined by placing

(5)  $\varphi_n^m(x) = \frac{1}{n} \sum_{\substack{g(i) < x \\ m < i \le m+n}} 1,$

so that $\varphi_n^m(x)$ is the relative frequency ("probability") of the inequality g(i) < x when the integer i is greater than m but not greater than m + n. Thus the function $\varphi_n^m$ of x is monotone and such that

(6)  $\varphi_n^m(-\infty) = 0, \quad \varphi_n^m(\infty) = 1.$

Furthermore, for every non-negative integer h,

(7)  $\int_{-\infty}^{\infty} x^h\,d\varphi_n^m(x) = \frac{1}{n} \sum_{i=m+1}^{m+n} g(i)^h.$

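If $k_1 = k_1(x)$, $k_2 = k_2(x)$, …, where $k_1 < k_2 < \cdots$, denotes the sequence (finite or infinite) of those integers k = k(x) which satisfy the inequality $g(p^k) < x$, it is readily verified from (2) and (5) that

$\varphi_n^m(x) = \frac{1}{n} \sum_j \left\{ \left[\frac{m+n}{p^{k_j}}\right] - \left[\frac{m}{p^{k_j}}\right] - \left[\frac{m+n}{p^{k_j+1}}\right] + \left[\frac{m}{p^{k_j+1}}\right] \right\},$

where [y] denotes the integral part of y and the summation index runs through all the subscripts of $k_j = k_j(x)$. Since this representation of $\varphi_n^m(x)$ obviously implies that, for every fixed N,

$\left|\varphi_n^m(x) - \left(1 - \frac{1}{p}\right) \sum_j \frac{1}{p^{k_j}}\right| < \frac{4N}{n} + \frac{1}{n} + \frac{2}{p^N},$

it follows that the limit relation,

$\varphi_n^m(x) \to \left(1 - \frac{1}{p}\right) \sum_j \frac{1}{p^{k_j(x)}} \quad \text{as } n \to \infty,$

holds uniformly for all m (= 1, 2, …). But the definition of the $k_j(x)$ shows that the expression on the right of this limit relation is identical with the monotone function φ(x), $-\infty < x < \infty$, defined by

(8)  $\varphi(x) = \left(1 - \frac{1}{p}\right) \sum_{g(p^k) < x} \frac{1}{p^k},$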








116 PHILIP HARTMAN AND AUREL WINTNER 


where the summation index, k, runs through those non-negative integers for
which g(p^k) < x. Accordingly,

(9) φ_n^m(x) → φ(x) as n → ∞

holds uniformly for all m. It is understood that x is arbitrarily fixed in such a
way as to be distinct from the values contained in the sequence of the discontinuity points of the monotone function φ(x). In other words, the arrow
in (9) is meant in the sense of the theory of monotone functions.⁶
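
The representation of the relative frequency by integral parts, and its convergence to the limit distribution, can be checked numerically. The following is only a minimal sketch and not part of the original argument: the function names, the prime p = 2 and the sample values g(2^k) = (−1)^k are illustrative assumptions, and the infinite series is truncated.

```python
from fractions import Fraction

def p_function(n, p, g_on_powers):
    """Value g(n) of a p-function: g(n) = g(p^k) when p^k | n but p^(k+1) does not."""
    k = 0
    while n % p == 0:
        n //= p
        k += 1
    return g_on_powers(k)

def phi_direct(x, m, n, p, g_on_powers):
    """Relative frequency of g(i) < x for m < i <= m + n, as in (5)."""
    hits = sum(1 for i in range(m + 1, m + n + 1)
               if p_function(i, p, g_on_powers) < x)
    return Fraction(hits, n)

def phi_by_integral_parts(x, m, n, p, g_on_powers):
    """The same frequency computed from integral parts [y]."""
    total, k = 0, 0
    while p ** k <= m + n:  # higher powers of p divide no integer up to m + n
        if g_on_powers(k) < x:
            q = p ** k
            total += (m + n) // q - m // q - (m + n) // (q * p) + m // (q * p)
        k += 1
    return Fraction(total, n)

def phi_limit(x, p, g_on_powers, k_max=200):
    """Right-hand side of (8), truncated after k_max terms (an approximation)."""
    s = sum(Fraction(1, p ** k) for k in range(k_max) if g_on_powers(k) < x)
    return (1 - Fraction(1, p)) * s

def sample_g(k):
    """Assumed sample data: g(2^k) = (-1)^k, a bounded p-function for p = 2."""
    return (-1) ** k
```

For instance, comparing `phi_direct(0, m, 10000, 2, sample_g)` for several starting points m with `phi_limit(0, 2, sample_g)` illustrates that the approach to the limit does not depend on m.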

All of this was independent of the assumption (3). Suppose now that (3)
is satisfied. Then, by (5),

(10) $\varphi_n^m(x) = \begin{cases} 0 & \text{if } -\infty < x < -\text{const.}, \\ 1 & \text{if } \text{const.} < x < \infty; \end{cases}$

so that the actual domain of integration on the left of (7) is uniformly bounded
in n and m together. It follows therefore from Helly’s theorem on term-by-term integration, that, since (9) holds uniformly in m, the limit relation,

(11) $\int_{-\infty}^{\infty} x^h \, d\varphi_n^m(x) \to \int_{-\infty}^{\infty} x^h \, d\varphi(x)$ as n → ∞,

holds uniformly in m for every fixed h. Hence it is seen from (7) that M*(g^h)
exists for every h. Since the existence of M*(g) follows by choosing h = 1,
the proof is complete.
On choosing h = 1 in (7) and (11) and substituting (8) in the integral on the
right of (11), one obtains for M*(g) the explicit representation

(12) $M^*(g) = \left(1 - \frac{1}{p}\right) \sum_{k=0}^{\infty} \frac{g(p^k)}{p^k},$


which will be needed in §4. 
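
The explicit representation of the mean can be compared with averages over finite windows. The sketch below is illustrative only: it assumes that M* may be approximated by the averages on the right of (7) with h = 1, and the prime p = 3 and the values g(3^k) = k mod 2 are hypothetical sample data.

```python
from fractions import Fraction

def p_function(n, p, g_on_powers):
    """Value g(n) of a p-function: g(n) = g(p^k) when p^k | n but p^(k+1) does not."""
    k = 0
    while n % p == 0:
        n //= p
        k += 1
    return g_on_powers(k)

def window_mean(m, n, p, g_on_powers):
    """Average of g(i) over m < i <= m + n, the right side of (7) with h = 1."""
    total = sum(p_function(i, p, g_on_powers) for i in range(m + 1, m + n + 1))
    return Fraction(total, n)

def mean_by_formula(p, g_on_powers, k_max=200):
    """The series (1 - 1/p) * sum g(p^k)/p^k, truncated after k_max terms."""
    s = sum(Fraction(g_on_powers(k), p ** k) for k in range(k_max))
    return (1 - Fraction(1, p)) * s

def sample_g(k):
    """Assumed sample data: g(3^k) = k mod 2."""
    return k % 2
```

Evaluating `window_mean(m, 9000, 3, sample_g)` for widely separated m and comparing with `mean_by_formula(3, sample_g)` shows the window averages settling near the same value regardless of where the window starts.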


4. It is known that, if g = g(n) is a p-function, it is almost periodic in the
sense of Bohr if and only if

$\lim_{k \to \infty} g(p^k)$

exists.⁷ It is also known that g is almost periodic (B²) if and only if the series

$\sum_{k=0}^{\infty} \frac{|g(p^k)|^2}{p^k}$

is convergent.⁸


6 Cf. A. Wintner, On the asymptotic repartition of the values of real almost periodic functions, American Journal of Mathematics, vol. 54(1932), pp. 339-345.

7 This criterion is contained in the considerations of Toeplitz, loc. cit., footnote 4.

8 E. R. van Kampen and A. Wintner, Theorem II, see footnote 5. Toeplitz (see footnote 4) has stated without proof that this criterion is necessary and sufficient for almost
periodicity (B²). If this statement were correct, it would follow that the two almost
periodic classes (W²), (B²) are identical in the present case. But the criterion proved
above implies that such is not the case. Toeplitz is not responsible for the erroneous
statement; cf., in fact, R. Schmidt, Die trigonometrische Approximation für eine Klasse
von verallgemeinerten fastperiodischen Funktionen, Mathematische Annalen, vol. 100(1928),
pp. 334-356, p. 335, footnote 5.















It will now be shown that g is almost periodic (W²) whenever M*(g) exists; so
that (4) is sufficient (and necessary) for the almost periodicity (W²) of a p-function
g. Since (4) is equivalent to (3), the statement is, in the main, a particular case
of the theorem italicized in §2 (actually, the statement is not implied by the
wording of the theorem of §2, since not every p-function is additive); cf. §5
below.

In order to prove the sufficiency of (3) for almost periodicity (W²), define a
sequence of functions g^(1) = g^(1)(n), g^(2) = g^(2)(n), ⋯ by placing

g^(j)(n) = g(n) for 1 ≤ n ≤ p^j and g^(j)(n + p^j) = g^(j)(n) for every n,

where g = g(n) is the given p-function and j a positive integer. It is easily
verified from the definition (2) of a p-function that the function |g^(j) − g|² of
n is a p-function. Furthermore, |g^(j) − g|² is a bounded function of n, since
g(n) is supposed to be bounded. Hence, the criterion proved in §3 assures
the existence of M*(|g − g^(j)|²), if this criterion for g is applied to |g − g^(j)|².
Moreover, if g in (12) is replaced by |g − g^(j)|²,

$M^*(|g - g^{(j)}|^2) = \left(1 - \frac{1}{p}\right) \sum_{k=0}^{\infty} \frac{|g(p^k) - g^{(j)}(p^k)|^2}{p^k}.$

But it is clear from the definition of the functions g^(j) of n that the value of
the infinite series on the right tends to 0 as j → ∞. Consequently,

M*(|g − g^(j)|²) → 0 as j → ∞.

Since every g^(j) is a periodic function of n, it follows that g is almost periodic
(W²).
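
The periodic approximants used in this proof can be made concrete. The following sketch is an illustration under stated assumptions, not the authors' computation: the helper names are hypothetical, the sample data are p = 2 and g(2^k) = (−1)^k, and the series for the mean-square distance is truncated.

```python
from fractions import Fraction

def p_function(n, p, g_on_powers):
    """Value g(n) of a p-function: g(n) = g(p^k) when p^k | n but p^(k+1) does not."""
    k = 0
    while n % p == 0:
        n //= p
        k += 1
    return g_on_powers(k)

def periodic_approximant(n, j, p, g_on_powers):
    """g^(j)(n): equal to g on 1..p^j and continued with period p^j."""
    period = p ** j
    r = n % period
    if r == 0:
        r = period
    return p_function(r, p, g_on_powers)

def mean_square_distance(j, p, g_on_powers, k_max=200):
    """(1 - 1/p) * sum |g(p^k) - g^(j)(p^k)|^2 / p^k, truncated after k_max terms."""
    s = Fraction(0)
    for k in range(k_max):
        diff = g_on_powers(k) - periodic_approximant(p ** k, j, p, g_on_powers)
        s += Fraction(diff * diff, p ** k)
    return (1 - Fraction(1, p)) * s

def sample_g(k):
    """Assumed sample data: g(2^k) = (-1)^k."""
    return (-1) ** k
```

With these sample data the distance is halved each time j increases by one, which illustrates the convergence to 0 used in the proof.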


5. The theorem announced in §2 can now be proved as follows. 

For every additive function f = f(n) and for every prime p, define a function
g_p = g_p(n) by placing

(13) g_p(n) = f(p^k) if p^k | n but p^(k+1) ∤ n, (k ≥ 0).

It is clear from the definition (2) that g_p is a p-function. Furthermore, g_p(1) =
f(1), and so g_p(1) = 0, by (1). But it is clear from the definitions (1), (2) that
a p-function is additive if and only if its value for n = 1 is 0. Hence, every
g_p is additive.

Moreover, from (13) and (1),

(14) $f(n) = \sum_p g_p(n)$ for every n,

where p runs through all primes; it is understood that the infinite series (14)
has only a finite number of non-vanishing terms for every fixed n.
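
The decomposition of an additive function into its p-components can be illustrated as follows. This is a minimal sketch under assumptions: definition (1) is taken to mean that f(n) is the sum of f(p^k) over the prime powers exactly dividing n (so that f(1) = 0), and the sample function f(p^k) = k and all helper names are hypothetical.

```python
from math import isqrt

def prime_power_factors(n):
    """Return {p: k} with p^k exactly dividing n, by trial division."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def additive(n, f_on_prime_powers):
    """f(n) as the sum of f(p^k) over prime powers exactly dividing n (assumed form of (1))."""
    return sum(f_on_prime_powers(p, k) for p, k in prime_power_factors(n).items())

def g_p(n, p, f_on_prime_powers):
    """The component of (13): f(p^k) if p^k | n but p^(k+1) does not; 0 when k = 0."""
    k = 0
    while n % p == 0:
        n //= p
        k += 1
    return f_on_prime_powers(p, k) if k > 0 else 0

def primes_up_to(limit):
    """All primes not exceeding limit, by trial division."""
    return [q for q in range(2, limit + 1)
            if all(q % d for d in range(2, isqrt(q) + 1))]

def sum_of_components(n, f_on_prime_powers):
    """Right-hand side of (14); only primes p <= n can give a non-zero term."""
    return sum(g_p(n, p, f_on_prime_powers) for p in primes_up_to(n))

def sample_f(p, k):
    """Assumed sample data: f(p^k) = k, the number of prime factors with multiplicity."""
    return k
```

The sample f is unbounded and serves only to check the identity (14); it is not an example of a bounded additive function.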

Suppose now that the additive function f(n) is bounded. Then, according
to (13), each of the functions g_p(n) of n is bounded and therefore, by §4, almost
periodic (W²). Hence, in order to prove that f(n) is almost periodic, it is sufficient to ascertain that, in virtue of the assumption | f(n) | < const., the infinite
series (14) is uniformly convergent for all n (= 1, 2, ⋯). But it is clear
from (1) and (14) that | f(n) | < const. is equivalent to⁹

(15) $\sum_p \operatorname{fin\,sup} |g_p(n)| < \infty,$

where the fin sup refers to the least upper bound belonging to fixed p and variable
n. Since (15) implies the uniform convergence of (14), the proof is complete.

It is clear from (14) that (15) is equivalent to

(15′) $\sum_p \operatorname{fin\,sup}_k |f(p^k)| < \infty.$

The merit of the formulation (15′) of the necessary and sufficient condition,
| f(n) | < const., for the almost periodicity (W²) consists in the fact that the
data f(p^k) occurring in (15′) are independent of one another; f(n) being defined
by (1).


6. The result of §3 can now be transferred from a p-function to an additive
function, as follows. An additive function f(n) has an M*-mean if and only if
it is a bounded function. For if f(n) is additive and bounded, then it is almost
periodic (W²), and therefore M*(f) exists; while if M*(f) exists, f(n) must be
bounded, by (i), §1.

It follows that the mere existence of M*(f) implies the almost periodicity (W²)
of f(n), if f(n) is additive.


7. It is known¹⁰ that an additive function f(n) is almost periodic (B^λ), for a
fixed λ ≥ 1, if and only if all four series








p > | f(p*) | > | f(p) |" 


b ] 
Pp \f(p) 21 Pp 


S(p) min (| f(p) |’, 1) 
» p’ ~ Pp 


are convergent. This implies that, if an additive f(n) is almost periodic (B^λ),
it need not be almost periodic (B^(λ+ε)). On the other hand, an additive f(n) is
almost periodic (W^λ) either for every or for no λ, where 1 ≤ λ < ∞. In fact,
the above proof of the criterion | f(n) | < const. for the almost periodicity (W^λ)
of f(n) in the case λ = 2 obviously holds for every λ.

The formulation (15′) of | f(n) | < const. now appears of particular interest
since the content of the condition | f(p^k) | < const. for the independent data


9 This trivial equivalence is only a manifestation of the statistical independence of the
terms of the series (14); cf. P. Erdős and A. Wintner, Additive arithmetical functions and
statistical independence, American Journal of Mathematics, vol. 61(1939), pp. 713-721
(§12); cf. B. Jessen and A. Wintner, loc. cit., footnote 2 (Theorem 3).

10 P. Hartman and A. Wintner, On the almost periodicity of additive number-theoretical
functions, American Journal of Mathematics, vol. 62(1940), pp. 753-758.


















f(p^k) becomes evident. In fact, if | f(p^k) | < const. for all p and k, the four
series quoted above are convergent for a fixed λ if and only if the three series


yr) >» min (| f(p) [', 1) > fe)! 

p Pp P Pp if(p)|21 DP 
are convergent. Since the latter do not involve λ, it follows that f is almost
periodic (B^λ) either for every λ or for no λ, if | f(p^k) | < const. Since all these
conditions together do not assure that | f(n) | < const., it also is seen that
an additive f can be almost periodic (B^λ) for every λ without being almost
periodic (W^λ).


U. S. ARMY AND THE JOHNS HOPKINS UNIVERSITY.








A CORRECTION TO A PREVIOUS PAPER 
By CHARLES B. MORREY, JR.


1. Introduction. In a previous paper,¹ the author gave the following result
(AC, Theorem 8.8):

A necessary and sufficient condition that the family {z(x)} of functions of class 𝔓_1
on the bounded region G be compact with respect to weak convergence in 𝔓_1 on G
is that the following two conditions hold:

(i) D̄_1(z, G) is uniformly bounded;

(ii) there exists a non-negative convex function g(r_1, ⋯, r_n) with the property
that

$\lim_{|r| \to \infty} |r|^{-1} g(r_1, \cdots, r_n) = +\infty, \qquad |r|^2 = r_1^2 + \cdots + r_n^2,$

and such that

$\int_G g[D_{x_1} z, \cdots, D_{x_n} z] \, dx$

is uniformly bounded.


This result, in the generality stated, is false. However, it is possible to replace 
this result by other results which are sufficient for the applications which the 
author makes to the calculus of variations. 

In this note, we shall use the notations and terminology of AC and shall
assume that the reader is familiar with that paper. For purposes of clarity,
however, we shall recall the definition of weak convergence in 𝔓_α and a necessary
and sufficient condition for compactness of a family with respect to weak convergence in 𝔓_α. One of the principal results of AC was that any of the spaces
𝔓_α (elements of which are classes of equivalent functions of class 𝔓_α on a
bounded region G) are Banach spaces. Thus weak convergence in 𝔓_α is already
defined in terms of that in a Banach space. A necessary and sufficient condition that a sequence {z_p(x)} converge weakly on G to z in 𝔓_α is that z_p → z
and D_{x_i} z_p → D_{x_i} z weakly in L_α on G (i = 1, ⋯, n). The following necessary
and sufficient condition for compactness with respect to weak convergence on a
general bounded region G has been proved in AC.


Received October 20, 1941. 

1 Functions of several variables and absolute continuity, II, Duke Mathematical Journal, 
vol. 6(1940), pp. 187-215. Part I of this paper by J. W. Calkin appeared in the same issue 
of this journal, pp. 170-186. We shall hereafter refer to the two parts as one paper and 
shall denote it by the letters AC. 

















Lemma (AC, Theorem 8.4). If α > 1, a necessary and sufficient condition that
the family {z(x)} be compact with respect to weak convergence in 𝔓_α on a bounded
region G is that

(1) $\bar{D}_\alpha(z, G) = \int_G |z|^\alpha \, dx + D_\alpha(z, G), \qquad D_\alpha(z, G) = \int_G \Bigl[ \sum_{i=1}^{n} (D_{x_i} z)^2 \Bigr]^{\alpha/2} dx$

be uniformly bounded. If α = 1, we must add to the condition (1), the condition
that the set functions

(2) $\varphi(e) = \int_e |z| \, dx, \qquad \psi_i(e) = \int_e D_{x_i} z \, dx$

be uniformly absolutely continuous on G.

The error made in proving the main result (AC, Theorem 8.8) was that no
mention whatever was made of the set functions φ(e). Actually, it was proved
correctly that the conditions stated were necessary and sufficient for (1) to hold
and for the set functions ψ_i(e) to be uniformly AC. That (1) and the uniform
absolute continuity of the ψ_i(e) do not imply the uniform absolute continuity
of the φ(e) for an arbitrary region G is seen in the example in §2 below. We
shall prove in Theorem 1 (§3) below that the result holds if G is of class K.
We shall show in Theorem 2 (§4) that the result holds if weak convergence is
replaced by a slightly weaker type of convergence. In Theorem 3 (§4) we show
that this slightly weaker type of convergence coincides with ordinary weak convergence in the case the functions all vanish on G* in the sense of AC, Definition 9.1, and hence in case the functions all coincide, in the sense of AC, Definition 9.3, on G* with a particular function z*(x) of class 𝔓_1 on G.


2. An example. Let G be a bounded region in the (x, y)-plane consisting of
the interior of the unit square 0 ≤ x ≤ 1, 0 ≤ y ≤ 1, together with the interiors
of a denumerable number of non-overlapping squares arranged in decreasing
order of magnitude from left to right above the top of the unit square and each
connected to it by a narrow rectangle with sides parallel to the axes. Let the
unit square be denoted by Q_0, the other squares by Q_1, Q_2, ⋯, and the connecting rectangles by R_1, R_2, ⋯. Let A_p denote the area of Q_p, w_p the
width of R_p, and h_p the height of R_p for each p; we assume that the A_p, w_p,
and h_p all tend to zero and shall specify the rate at which we wish the w_p to
tend to zero later.
On G, we define a sequence {z_p(x, y)} of functions, each of class 𝔓_1 on G as
follows:

$z_p(x, y) = \begin{cases} 0, & (x, y) \in Q_0, \\ (y - 1)/(h_p A_p), & (x, y) \in R_p, \\ A_p^{-1}, & (x, y) \in Q_p, \\ 0, & (x, y) \in Q_q + R_q, \ q \ne p. \end{cases}$







Then z_p(x, y) ≥ 0, D_x z_p = 0, and D_y z_p ≥ 0 everywhere on G (except at the top
and bottom of R_p where D_y z_p does not exist) and we have

$\iint_G z_p(x, y) \, dx \, dy = 1 + \frac{w_p h_p}{2 A_p}, \qquad \iint_G D_y z_p \, dx \, dy = w_p A_p^{-1}.$

If we merely choose w_p so that w_p·A_p^{-1} → 0, we see that z_p(x, y) tends strongly
in 𝔓_1 to the function z(x, y) = 0 on any region T with T̄ ⊂ G. Moreover,

$\lim_{p \to \infty} D_1(z_p, G) = \lim_{p \to \infty} \iint_G D_y z_p \, dx \, dy = 0$

so that the derivatives D_x z_p and D_y z_p tend strongly to zero in L_1 on G.

Now, choose ε > 0 and choose P so large that D_1(z_p, G) < ε for p > P.
Then choose δ > 0 so small that | ψ_{1,p}(e) | and | ψ_{2,p}(e) | < ε for any e with
m(e) < δ, p = 1, ⋯, P. From the condition on P and from the fact that
D_x z_p, D_y z_p ≥ 0, it follows that if m(e) < δ then | ψ_{1,p}(e) |, | ψ_{2,p}(e) | < ε for
all p. Thus the set functions ψ_{1,p}(e) and ψ_{2,p}(e) are uniformly absolutely continuous. Furthermore D̄_1(z_p, G) = 1 + w_p·A_p^{-1}·(1 + h_p/2) which is uniformly bounded. However, since

$\iint_{Q_p} |z_p| \, dx \, dy = 1$

for every p and lim_{p→∞} A_p = 0, it follows that the set functions φ_p(e) are not uniformly AC.
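
The integrals in this example can be evaluated exactly for a concrete choice of the dimensions. The following sketch is illustrative only: the choice A_p = 4^(−p), h_p = 2^(−p), w_p = 8^(−p) is an assumption made here (any choice with w_p/A_p → 0 would serve), and the function and key names are hypothetical.

```python
from fractions import Fraction

def example_quantities(p):
    """Exact integrals attached to z_p for the assumed dimensions of Q_p and R_p."""
    A = Fraction(1, 4 ** p)   # area of the square Q_p (assumed)
    h = Fraction(1, 2 ** p)   # height of the rectangle R_p (assumed)
    w = Fraction(1, 8 ** p)   # width of R_p (assumed), so w_p / A_p = 2^-p
    mass_on_Q = A * (1 / A)                 # z_p = 1/A_p on Q_p
    mass_on_R = w * h / (2 * A)             # integral of (y - 1)/(h_p A_p) over R_p
    gradient = (w * h) * (1 / (h * A))      # D_y z_p = 1/(h_p A_p) on R_p
    return {
        "area_Q": A,
        "mass_on_Q": mass_on_Q,
        "integral_z": mass_on_Q + mass_on_R,
        "integral_Dy_z": gradient,
    }
```

For these dimensions the integral of z_p over Q_p stays equal to 1 while the area of Q_p shrinks to zero, which is exactly the failure of uniform absolute continuity described above; the gradient integral meanwhile tends to zero.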


3. The result for regions of class K. In this section, we prove that our main 
result holds in case G is of class K. To do this, we first prove the following 
more general theorem: 


THEOREM 1. Let {z(x)} be a family of functions defined and of class 𝔓_1 on a
bounded region G of class K and suppose that D̄_1(z, G) is uniformly bounded.
Then the set functions φ(e) are uniformly AC on G.


Proof. Let (Γ, γ, N, T) be a canonical covering of G (see AC, Definition
7.3), let


$z(x) = \sum_{i=1}^{N} z_i(x)$

be the corresponding canonical resolution of z(x) (see AC, Definition 7.4), and
let w_i(y) be the transform of z_i(x) under T_i, i = 1, ⋯, N. In case w_i(y) is
defined only on R_2 (| y_j | < 1, j = 1, ⋯, n − 1, −1 < y_n ≤ 0) extend it to the
whole of R_1 (| y_j | < 1) by the equation w_i(y'_n, y_n) = w_i(y'_n, −y_n). Then
each w_i is of class 𝔓_1 on R_1 and vanishes on R_1* and we evidently have that





















D̄_1(w_i, R_1) is uniformly bounded for each i (see AC, Theorem 6.1 and Lemma
7.4). From AC, Theorem 9.3, it follows that

$\int_E |w_i(y)| \, dy \le \gamma_n^{-1/n} \cdot [m(E)]^{1/n} \cdot D_1(w_i, R_1), \qquad (\gamma_n r^n = m[C(P, r)])$

for all measurable sets E in R_1. Since each T_i is regular and of class K from R_1
or R_2 to Γ_i and since each z_i(x) is zero on and near G − Γ_i, it follows that

$\int_e |z_i(x)| \, dx \le K \cdot [m(e)]^{1/n} \cdot \bar{D}_1(z, G),$

where K is a constant depending only on G. The result follows by addition.
Our first result is an immediate consequence of this theorem.
Our first result is an immediate consequence of this theorem. 


COROLLARY. The principal theorem (AC, Theorem 8.8) holds if G is of class K.


4. The result, using a new type of convergence. It is evident from the result
of the preceding section that the conditions in our main theorem imply that the
family {z(x)} is compact with respect to weak convergence on each region T
with T̄ ⊂ G. This suggests the following definition of “pseudo-weak convergence” in 𝔓_1 on G.

DEFINITION. We say that a sequence {z_p(x)} converges pseudo-weakly to z(x)
in 𝔓_1 on G if z_p(x) and z(x) are of class 𝔓_1 on G and

(i) D̄_1(z_p, G) is uniformly bounded,

(ii) the set functions ψ_{i,p}(e) are uniformly AC on G, and

(iii) z_p(x) tends weakly in 𝔓_1 to z(x) on each bounded region T with T̄ ⊂ G.

With this definition in mind we see immediately that the following theorem 
holds. 


THEOREM 2. Our principal result holds if the words “weak convergence” are
replaced by “pseudo-weak convergence”.

Proof. For, given any sequence {z_p(x)} of our family, we may choose a
sequence of regions {G_k}, each of class K such that G_k ⊂ G_{k+1} for each k and
any closed set F in G is interior to all the G_k for k > k_F. For each k, we may
extract a subsequence {z_{p,k}} of {z_{p,k−1}} (z_{p,0} = z_p) which converges weakly in
𝔓_1 on G_k to some function z_k(x). Now, it is clear that z_{k+1}(x) = z_k(x) (essentially) on G_k and that z_k(x) is of class 𝔓_1 on G_k with

$\bar{D}_1(z_k, G_k) \le \liminf_{p \to \infty} \bar{D}_1(z_{p,k}, G_k)$

and so is bounded independently of k. Thus there is a function z(x) of class 𝔓_1
on G which coincides with z_k on G_k for each k. If we now let {z_p(x)} be the
diagonal sequence, we see that z_p(x) tends weakly in 𝔓_1 to z(x) on each G_k and
hence on each bounded region T with T̄ ⊂ G. This proves the theorem.

We now compare pseudo-weak convergence with weak convergence in case
all the functions z_p(x) vanish on G*, G being bounded.







THEOREM 3. Suppose each function z_p(x) of a sequence {z_p(x)} vanishes on G*
in the sense of AC, Definition 9.1, and suppose that {z_p(x)} tends pseudo-weakly
in 𝔓_1 on the bounded region G to a function z(x). Then {z_p(x)} converges weakly
in 𝔓_1 on G to z(x) and hence z(x) vanishes on G*.

Proof. Since {z_p(x)} tends pseudo-weakly in 𝔓_1 on G, we know that the set
functions ψ_{i,p}(e) are uniformly absolutely continuous over the whole of G and
that {z_p(x)} tends weakly in 𝔓_1 to z(x) on each cell R with R̄ ⊂ G. Thus,
from §1, it follows that z_p(x) and D_{x_i} z_p tend weakly in L_1 to z and D_{x_i} z on each
cell R with R̄ in G. Also from AC, Theorem 9.3, it follows that

$\int_e |z_p(x)| \, dx \le \gamma_n^{-1/n} \cdot [m(e)]^{1/n} \cdot D_1(z_p, G), \qquad e \subset G,$

so that the set functions φ_p(e) are also uniformly AC on G. But, from a well
known theorem (AC, Lemma 8.1), this is sufficient for z_p(x) and the D_{x_i} z_p to
tend weakly in L_1 on the whole of G to z and the D_{x_i} z, respectively. Thus z_p
tends weakly in 𝔓_1 to z on the whole of G. That z(x) also vanishes on G*
follows from AC, Theorem 9.2.

COROLLARY. Theorem 3 still holds if all the z_p(x) coincide in the sense of AC,
Definition 9.3, on G* with a particular function z*(x) of class 𝔓_1 on G; in this case
z(x) also coincides on G* with z*(x).

Proof. This follows immediately by considering the functions z_p(x) − z*(x)
and z(x) − z*(x).


5. Further changes in AC, part II. 

p. 192, line 4. For G read S. 

p. 199, line 21. Insert “which are open in G” between “yy” and “such”. 

p. 203, second line above Lemma 8.1. For ([1], chapter IV) read ([1], chapter
IV, pp. 64, 65 and chapter IX, pp. 135-136).

p. 210, footnote 4, first line. For = read S. 

p. 213, line 16. For ([1], chapter IX) read ([1], chapter IX, Theorem 2).


UNIVERSITY OF CALIFORNIA. 



















A GENERAL KUMMER THEORY FOR FUNCTION FIELDS 


By SAUNDERS MAC LANE AND O. F. G. SCHILLING


INTRODUCTION 


Kummer theory studies the generation of Abelian fields by radicals over a 
given base field F. This paper will develop a “relative” form of the theory, 
in which the base field F is a field of algebraic functions over a coefficient field. 
The modified theory then considers extensions of this F which are generated by 
radicals, or by arbitrary algebraic extensions of the coefficient field, or both. 
The normal extensions of this type, unlike the ordinary Kummer extensions, 
are in general non-Abelian. 

This development was suggested by an attempt to generalize the principal 
ideal theorem of algebraic number theory. Schmidt, Hasse, and others [17], [8]¹
have shown that the class field theory over algebraic number fields has a strict 
analogue for function fields of one variable over a finite coefficient field. One 
might then surmise that the principal ideal theorem of Hilbert [2], [7] has a
similar analogue. We may phrase this conjecture as follows. Given the group 
of divisor classes of degree zero in F, does there exist an unramified Abelian 
extension K of F whose Galois group is isomorphic to the group of divisor classes 
and in which all these divisors become principal? We shall show that this is 
not the case. 

First, we describe more explicitly the behavior of divisor classes which become 
principal in an extension, and show that the behavior of such principal divisors 
can be restated in an elementary fashion, free of arithmetic concepts. 

Consider an algebraic function field F of one variable, over a field ℱ of constants (in the classical case, ℱ = the complex numbers). The field F has an
abstract Riemann surface whose points P can be described as prime divisors;
i.e., as homomorphic mappings of F on ℱ′ plus ∞, where ℱ′ is an algebraic
extension of the field ℱ of constants. Each function x of F has a finite number
of zeros and poles at various prime divisors P_i. With the proper multiplicities
(positive for zeros, negative for poles) these may be listed as a formal product:

(x) = P_1^{e_1} P_2^{e_2} ⋯ P_s^{e_s}.

This product is the divisor of x. Any such formal product A = ∏ P_i^{e_i} is called
a divisor, though it need not be the divisor of any function x of the field. The
sum Σ e_i f_i is the degree of A, if f_i = [ℱ_i : ℱ]. The divisors of the form
A = (x) are called principal divisors; they always have degree zero (number of
zeros = number of poles). If K is any finite algebraic extension of the given
function field F, each divisor of F may be construed as a suitable divisor of K,

Received October 24, 1941; presented to the American Mathematical Society on May 2,
1941 under the title “The principal divisor theorem for function fields”.
1 Numbers in square brackets refer to the bibliography at the end of the paper. 









so that in addition to divisors principal in F there may be divisors which first 
become principal in K. 

The principal divisor problem is the following. Given a formal list of zeros
and poles which do not correspond to any function x of F, construct a new
algebraic function u which will have exactly these zeros and poles. Since the
new function u generates an algebraic extension K = F(u), the problem may be
reformulated thus. Given in F a non-principal divisor A of degree zero, consider the structure of those algebraic extensions K in which A becomes principal.

Every divisor A which becomes principal in a finite extension K has a finite order
in the sense that some integral power A^m of the divisor A is principal. To
prove this it suffices to consider the case when the extension K is normal. In
case F has a finite characteristic p, inseparable extensions must be included;
hence, let K_0 be the field of all elements of K separable over F with a degree
n = [K_0 : F]. Suppose A = (u) is a divisor principal in K. Then there is a
power q of the characteristic p such that u^q is separable over F so that
A^q = (u^q) is principal in the separable subfield K_0. If N denotes the norm
from K_0 to F, then NA^q = A^{qn} = N(u^q) = (Nu^q), where Nu^q is an element of F.
Therefore, A^{qn} is principal in F so that the given divisor A does have finite
order. More specifically, this proves that every divisor principal in a normal
extension has order dividing the degree (or, the “reduced degree” in the sense
of Steinitz) of that extension.

The set D of all divisors which become principal in a given extension K of F 
is a group under the natural formal multiplication of divisors. This group con- 
tains the subgroup (F) of all principal divisors, and the result just established 
asserts that every element of D/(F) has finite order. The principal divisor 
problem then concerns the structure of those fields K for which the corresponding 
group D contains a given group D_0 of divisors.

The arithmetic notion of a prime divisor (a point on the Riemann surface) 
is not essential to the statement of this problem. By reformulating the problem 
without using this concept, it is possible to treat also the case of function fields 
of any number of variables. 

First, the principal divisors of F can be described directly. Two such divisors
multiply by the rule (x)(y) = (xy), so that the correspondence x → (x) maps
the multiplicative group of non-zero functions x of F on the group of principal
divisors. The functions with divisor (x) = 1 (i.e., with no zeros or poles) are
simply the constants of ℱ; hence the group of principal divisors may be described
as the factor group F*/ℱ*, where F* (or ℱ*) denotes the group of all non-zero
elements of the field F (or ℱ). The group of principal divisors (u) of the extended field K may be described in similar fashion; it has as subgroup the group
of principal divisors (x) of F, provided each principal divisor (x) of F is identified
with the corresponding principal divisor in K.

Next, each non-principal divisor A of finite order may be replaced by a formal 
symbol d which will serve all the purposes discussed above, and which will be 
called a “fractional” divisor. Specifically, if (x) is any principal divisor of F 













which is not a proper power of any other principal divisor, a fractional divisor
d = (x)^{1/m}, for m a positive integer, is the generator d of any infinite cyclic
group {d^i} which contains the given principal divisor (x) as the power d^m = (x).
Two fractional divisors d and d′ will be identified if there is an isomorphism
between the corresponding groups which carries d into d′ and leaves the principal
divisor (x) fixed. In this sense there is one and only one fractional divisor
(x)^{1/m} for given x and m.

Consider now an (arithmetic) divisor A of order m in F with A^m = (x). It
can be identified with the fractional divisor d = (x)^{1/m}. The divisor A becomes
principal in an extension K if and only if A = (r), for some r ∈ K; that is, if and
only if the corresponding fractional divisor d can be identified with the principal 
divisor (r) = d of K. Every problem about divisors which become principal 
can thus be stated in terms of principal divisors and ‘‘fractional’’ divisors; the 
problems become more inclusive to the exact extent that some fractional divisors 
d do not correspond to arithmetic divisors A. 

An element r which makes a divisor d = A of F principal will be called a
radical over F; that is, r is a radical if (r)^m = (x) for some integer m and for
some x in F. If K is any algebraic extension of F and F({r}) is the subfield of K
generated by all its radicals, then the divisors of F which become principal in K 
are exactly the divisors which become principal in F({r}). In this sense, our 
problem reduces to the study of “radical” extensions of F which, like F({r}), 
are generated by radicals. Observe, by the way, the simple properties of F({r}) 
relative to K. It may be uniquely characterized as a maximal subfield of K 
which is a radical extension of F. Every subfield of K which is a radical ex- 
tension of F is contained in F({r}). If K is normal over F, so is F({r}). 

The decisive step of replacing the arithmetic divisors by the fractional divisors 
can be applied in other problems, such as the investigations of Deuring [6]. 
His paper studies the structure of Abelian extensions K of a function field F 
of one variable, in the special case when K has the same field of constants as 
does F (i.e., when each element of K algebraic over the original coefficient field ℱ
necessarily lies in ℱ) and when F contains all n-th roots of unity, where n is
the degree of K/F and is prime to the characteristic of F. Since such a field K 
is a composite of cyclic fields over F, the elementary results of Galois theory 
in the solution of cyclic equations by radicals show that any such field K is a 
“radical” extension of F, in our sense. All of Deuring’s main results on these 
Abelian extensions can be stated in terms of fractional divisors without using 
Deuring’s elaborate study of the extensions of prime divisors. His principal
theorems (Theorems 11, 12, and 13 in [6]) are essentially corollaries of our 
results below. 

Our first chapter treats the algebraic structure of a radical extension. An 
extension of this type may be characterized by a relation between the degree 
of the extension and the order of the group of those divisors which become 
principal in the extension. From this it follows that any subextension of a 
radical extension can also be generated by radicals. The second chapter con- 







cerns the Galois group of a radical extension. The major theorem states neces- 
sary and sufficient conditions for the realization of an abstractly given group 
as the Galois group of such an extension. To this end, the group is considered 
as an extension of a suitable normal subgroup by the corresponding factor group 
and the conditions for realization are stated in terms of the factor sets describing 
this group extension. The third chapter applies these results to more specific 
cases, such as the classical case of a finite coefficient field and the case of a 
divisor class of prime power order. 


CHAPTER I 
ALGEBRAIC PROPERTIES OF RADICAL EXTENSIONS 
1. Fractional divisors in general function fields. The basis of our study is a
fixed pair of fields F ⊃ ℱ. We call ℱ the field of constants or coefficient field,
and require that it be algebraically closed in F (i.e., that every element of F
algebraic over ℱ lie in ℱ). The transcendence degree of F over ℱ is arbitrary;
for example, F might be a function field F = ℱ(x_1, ⋯, x_n, y_1, ⋯, y_m) of n
variables over ℱ, generated by n simultaneous indeterminates x_1, ⋯, x_n and m
elements y_1, ⋯, y_m algebraic over ℱ(x_1, ⋯, x_n).

If x is any non-zero element (or “function”) of F, the principal divisor (x)
is the coset of x modulo the multiplicative subgroup ℱ* of non-zero constants
of ℱ. Thus, (x) = 1 if and only if x ∈ ℱ*. The group of all principal divisors
from F is the factor-group F*/ℱ*, and will be denoted more briefly by (F). No
element² of this group has finite order, for if (x)^m = (x^m) = 1, then x^m = b
is a constant of ℱ, so that x is algebraic over ℱ and hence is in ℱ because of the
hypothesis that ℱ is algebraically closed in F. Therefore, (x)^m = 1 implies
(x) = 1.


For this reason, the correspondence (x) → (x)^m is an isomorphism between the
group (F) and the subgroup (F)^m of all m-th powers of elements of (F). The
group (F)^m, isomorphic to (F), can thus be embedded in a larger Abelian group
(F) in which every element has a unique m-th root. This embedding process
can be applied to the original group (F) itself, letting m have the successive
values 2!, 3!, ⋯, m!, ⋯. The limit provides an Abelian group D_∞ ⊃ (F)
with the following properties:

(i) D_∞ has no elements of finite order.

(ii) Every element of D_∞ has finite order, modulo (F).

(iii) Every element of (F) has roots of all orders in D_∞; that is, for each (x)
in (F) and each integer m, there exists d in D_∞ with d^m = (x).

These three properties uniquely determine the group D_∞ up to isomorphism
over (F). Furthermore, if D is any other Abelian group containing the base

2 With the trivial exception of the identity element. This exception will not be mentioned in subsequent cases like this.
















A GENERAL KUMMER THEORY FOR FUNCTION FIELDS 129 


group (F) and having the first two properties, D can be isomorphically
embedded³ in D_∞ in one and only one way.
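One concrete model of this group may help fix ideas. The following is only a sketch and is not taken from the text: since (F) has no elements of finite order, it embeds in its rational hull, and the three listed properties can be checked there directly. The exponential notation (x)^{a/b} is an assumption introduced purely for illustration.

```latex
% Sketch (assumption): model the group of fractional divisors as the
% rational hull of the torsion-free group (F) of principal divisors.
\[
  D_\infty \;\cong\; (F)\otimes_{\mathbb{Z}}\mathbb{Q}
  \;=\;\bigl\{\,(x)^{a/b} : (x)\in(F),\ a,b\in\mathbb{Z},\ b>0\,\bigr\}.
\]
% (i)   a rational vector space has no elements of finite order;
% (ii)  \bigl((x)^{a/b}\bigr)^{b} = (x)^{a} = (x^{a}) \in (F);
% (iii) (x)^{1/m} is an m-th root of (x) for every integer m > 0.
```

Since the three properties determine the group up to isomorphism over (F), any model satisfying them may be used in place of the limit construction.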

In the classical case where F is an algebraic function field of one variable over
§, the group D_∞ consists of all those formal products B = P_1^{e_1} ··· P_n^{e_n} of
prime divisors P_i with rational exponents e_i for which some power B^m is a
principal divisor.
Supported by this imperfect analogy, we thus appropriate classical terminology.
The group D_∞ ⊃ (F) will be the group of fractional divisors of F/§, or simply the
group of divisors. The order of a divisor d of D_∞ is its order modulo (F), that
is, the order of its coset d(F). This coset is the divisor class of d. A group of
divisors D is any subgroup of D_∞ which contains the group (F) of principal
divisors. The exponent e of such a group D is the exponent of the Abelian
group D/(F); that is, it is the largest order of any divisor class in this group.

If the field K is an algebraic extension of the base field F, the corresponding
script letter 𝒦 will denote the field of constants of K; that is, the field 𝒦
composed of all the elements of K which are algebraic over the original
coefficient field §. Because of the transitivity of algebraic dependence, the
field 𝒦 is algebraically closed in its function field K. Any function x ≠ 0
of F has a principal divisor (x) = (x; F) in F and a conceptually distinct principal
divisor (x; K) in K. But (x; K) = 1 if and only if x = b is a constant of 𝒦.
By virtue of the algebraic closure of § in F, any constant b in both 𝒦 and F
must lie in the original constant field §. Therefore, (x; K) = 1 if and only if
(x; F) = 1; the correspondence (x; F) ↔ (x; K) maps the principal divisors of
F isomorphically on certain of the principal divisors of K. We can and will
identify the divisor (x) = (x; F) with the divisor (x; K), so that (F) becomes
a subgroup of (K).

Other principal divisors from K can sometimes be identified with fractional
divisors from the group D_∞ of F. Specifically, an element r of K will be called a
radical if some power (r)^m of its principal divisor is a principal divisor in (F).
The least such exponent m is the order of the radical r, and if (r)^m = (x) for
some function x of F, one must have (r^m/x) = 1, or r^m/x a constant b of the
constant field 𝒦, so that

(1) r^m = bx (r a radical, x ∈ F, b ∈ 𝒦).
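As a small illustration of equation (1), under assumptions not made in the text, take the coefficient field to be the rational numbers and F the rational function field in one variable t, and adjoin a square root r of 2t. The choice of field and of the element 2t is hypothetical; it serves only to show how the constant b and the order m arise.

```latex
% Illustrative assumption: F = \mathbb{Q}(t), K = F(r) with r^2 = 2t.
% Here \mathbb{Q} is algebraically closed in F, and K = \mathbb{Q}(r) has
% constant field \mathbb{Q} as well.
\[
  r^{2} = 2\,t, \qquad b = 2 \in \mathbb{Q}^{*}, \quad x = t \in F,
\]
\[
  (r)^{2} = (r^{2}) = (2t) = (t) \in (F), \qquad (r) \notin (F).
\]
% (r) is not in (F) because t is not a rational constant times a square
% in \mathbb{Q}(t).  Hence r is a radical of order m = 2.
```

In this example the divisor (r) plays the role of a square root of the principal divisor (t).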


The group of principal divisors generated by (r) and (F) is an Abelian group
without elements of finite order, such that every element has a finite order,
modulo (F); thus it has an isomorphic replica within the group D_∞ of fractional
divisors of F. Therefore, the principal divisor (r) may be interpreted as a
fractional divisor (r) = d. In this sense, the fractional divisor d of F has become
principal in the extended field K. Conversely, suppose that G is any group of
principal divisors (z) of K which has an isomorphic embedding (z) → (z)′ into
the group D_∞, and let d be any divisor which becomes principal in this
embedding, as d = (z)′. Then by the definition of D_∞, d^m = (x) for some integer


³ Here and subsequently isomorphisms of groups containing (F) are assumed to leave
fixed all the elements of (F).








130 SAUNDERS MAC LANE AND O. F. G. SCHILLING 


m, so that ((z)′)^m = d^m = (x), and so (z^m) = (x). The function z of the extended
field is therefore a radical. If we identify each divisor (r) of a radical with the
appropriate fractional divisor, the result may be stated as follows.

THEOREM 1. A fractional divisor d of the field F can be interpreted as a
principal divisor of the algebraic extension K ⊃ F if and only if d = (r), for some
radical r of K/F.

The radicals of K form a multiplicative group, the group R of radicals, which
contains the subgroup F*𝒦* generated by non-zero elements of F and 𝒦. The
group (R) of all principal divisors of radicals may be identified with a subgroup
D of D_∞; this group consists of all fractional divisors of F which “become
principal” in K, and we call it the divisor group D = D(K) associated with K.
The corresponding group D/(F) of divisor classes is isomorphic to the quotient
group R/F*𝒦*. In sum, a field K with radical R and a divisor group D has

(2) D = (R), D/(F) = (R)/(F) ≅ R/F*𝒦*.


Our object is the study of fields K with a given associated divisor group D.
This study will reduce to that of “radical extensions”; we say that K is a radical
extension of F if K can be generated from F by the adjunction to F of some or
all of the radicals of K.


2. Pure coefficient extensions. It is essential to break an arbitrary extension
K ⊃ F into two stages K ⊃ F(𝒦) ⊃ F, where the intermediate field F(𝒦) is⁴
generated by the adjunction to F of all the elements of the field 𝒦 of constants.
A coefficient extension of F is any extension so generated. The requisite
properties of these extensions will be stated now; most of them are well known, at
least for function fields of one variable (see, for example, [6], §1).

Let 𝒲 be any separable algebraic extension of the original coefficient field §,
and W = F(𝒲) the coefficient extension generated by 𝒲. Because § is
algebraically closed in F, an equation for any element of 𝒲 irreducible over §
remains irreducible over F. If this fact is applied to a primitive element of a
finite extension 𝒲 of §, the degree of W over F is found to be

(3) [W:F] = [F(𝒲):F] = [𝒲:§].

We now assert that 𝒲 is the coefficient field of F(𝒲).

LEMMA 1. 𝒲 is algebraically closed in F(𝒲) = W.

Proof. Consider first any element u of W algebraic and separable over 𝒲.
There is then a subfield 𝒲_0 ⊂ 𝒲 finite over § with u in F(𝒲_0). Two
applications of (3) to F(𝒲_0) = F(𝒲_0, u) give

[𝒲_0(u):§] = [F(𝒲_0, u):F] = [F(𝒲_0):F] = [𝒲_0:§];


⁴ F(𝒦) is simply the join F ∪ 𝒦 of the two subfields F and 𝒦 of K. For such a join we
systematically use the “adjunction” notation F(𝒦).


















hence, 𝒲_0(u) ⊂ 𝒲_0, so u ∈ 𝒲. Consider next an element u ∈ W inseparable and
algebraic over 𝒲. The irreducible equation for u over § is then inseparable and
as before remains irreducible and therefore inseparable over F. This
contradicts the fact that F(𝒲)/F is generated by a separable extension 𝒲/§.

Any automorphism S of W/F may be applied to the elements of 𝒲 contained
in W and so induces an automorphism σ of 𝒲/§. If 𝒲/§ is finite and normal,
consideration of the effects of this automorphism on a primitive element of
𝒲/§, which is also primitive for W/F, will prove the following result. By the
usual directed systems of finite subfields this can be generalized to infinite
normal extensions.


LEMMA 2. If 𝒲/§ is normal, so is W/F. The correspondence S → σ carrying
each automorphism of W/F into the induced automorphism σ of 𝒲/§ is an
isomorphism of the Galois group of W/F to that of 𝒲/§. If 𝒲/§ is infinite, this
isomorphism is continuous in both directions in the topology of the Galois groups.


Here and subsequently we use the customary topology for the group of
automorphisms of an infinite extension W ⊃ F: each neighborhood of 1 is
determined by a subfield M ⊂ W finite over F and consists of all automorphisms
leaving M elementwise fixed.

Finally, we consider the effect of a simultaneous coefficient extension on a
pair of fields K ⊃ F with 𝒦 = §.

LEMMA 3. If K is a finite extension of F such that § is algebraically closed in
K while 𝒲 is a separable algebraic extension of §, then K(𝒲)/F(𝒲) is finite of
degree

(4) [K(𝒲):F(𝒲)] = [K:F].

Proof. If 𝒲 is finite over §, the equality (4) results at once from two
applications of the equation (3) to the chains K(𝒲) ⊃ K ⊃ F, K(𝒲) ⊃ F(𝒲) ⊃ F
joining K(𝒲) to F. If 𝒲 is infinite over §, analyze K by the tower of fields,
F ⊂ K_0 ⊂ K_1 ⊂ ··· ⊂ K_m = K, where each K_e consists of all elements u of K
whose p^e-th power is separable over F.

First, the separable extension K_0 is generated from F by a primitive element u.
We already know that the polynomial equation for u over F remains irreducible
over F(𝒲_0) for any finite subfield 𝒲_0 ⊂ 𝒲; hence, it is irreducible over the whole
field F(𝒲) and

(5) [K_0(𝒲):F(𝒲)] = [F(𝒲, u):F(𝒲)] = [F(u):F] = [K_0:F].

Second, the inseparable extension K_{e+1}/K_e of degree p^n and exponent p can
be generated as K_{e+1} = K_e(x_1, ..., x_n), where the elements x_1^p, ..., x_n^p are
p-independent in K_e. Since a separable extension 𝒲/§ preserves p-independence
(see, e.g., [10]), these elements x_1^p, ..., x_n^p remain p-independent in K_e(𝒲)
so that the extension K_{e+1}(𝒲) = K_e(𝒲, x_1, ..., x_n) still has the degree p^n over
K_e(𝒲) or

[K_{e+1}(𝒲):K_e(𝒲)] = [K_{e+1}:K_e] = p^n (n = n_e).









Combined with (5), this result for e = 0, 1, 2, ..., m gives the desired
conclusion (4).
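The final combination can be written out as a product of degrees along the tower. The display below is a sketch of that bookkeeping; it writes \mathcal{W} for the separable coefficient extension of the lemma and K_e for the fields of the tower, and the indexing of the product from e = 0 to m − 1 is our reading of the argument rather than a formula quoted from it.

```latex
% Degrees multiply along F(W) \subset K_0(W) \subset ... \subset K_m(W) = K(W).
\[
  [K(\mathcal{W}):F(\mathcal{W})]
  = [K_0(\mathcal{W}):F(\mathcal{W})]\,
    \prod_{e=0}^{m-1}[K_{e+1}(\mathcal{W}):K_e(\mathcal{W})]
  = [K_0:F]\,\prod_{e=0}^{m-1}[K_{e+1}:K_e]
  = [K:F].
\]
% The first factor is unchanged by the separable step, the remaining
% factors by the preservation of p-independence.
```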

In these lemmas the requirement that 𝒲 be separable over § is not a captious
restriction. For example, consider the field P(x, y, z) generated by three
independent indeterminates x, y, z over a perfect field P of characteristic p, and
let § = P(x^p, y^p), F = P(x^p, y^p, z, u) = §(z, u), where u = yz + x. To show
that § is actually algebraically closed in F, observe first that F is obtained
from the pure transcendental extension §(z) by the adjunction of a p-th root
u = (y^p z^p + x^p)^{1/p}. If w ∈ F is algebraic over §, w^p ∈ §(z); hence, w^p ∈ §, by
the nature of a transcendental extension, so that w ∈ P(x, y). Thus w lies in
the intersection of P(x, y, z^p) and P(x^p, y^p, z, u) = F. In a study of certain
non-modular lattices, it was shown [9] that this intersection is P(x^p, y^p, z^p).
But w ∈ P(x^p, y^p, z^p) and w algebraic over P(x^p, y^p) give w in P(x^p, y^p) so that §
is indeed algebraically closed in F.

This field § has a purely inseparable extension 𝒲 = §(y) of degree p (y^p ∈ §).
The corresponding coefficient extension is W = F(y) = P(x^p, y^p, z, yz + x, y),
and contains an element x = (yz + x) − y·z algebraic over § but not in 𝒲, so
Lemma 1 fails. The equation (3) above will fail if one uses the extension
𝒲 = §(y, x) of degree p^2. One may also show that Lemma 3 would break
down in a similar case.
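The failure of equation (3) in this example can be checked by a short degree count. In the sketch below, P denotes the perfect ground field, \mathcal{F} = P(x^p, y^p) the coefficient field, and F = P(x^p, y^p, z, u) with u = yz + x; these letters are our notation for the fields of the example and are assumptions insofar as the printed symbols differ.

```latex
% Degree of the inseparable coefficient extension \mathcal{F}(x, y):
\[
  [\mathcal{F}(x,y):\mathcal{F}] = [P(x,y):P(x^{p},y^{p})] = p^{2}.
\]
% Degree of the induced extension of F.  Note F(x, y) = P(x, y, z), and
% u^p = y^p z^p + x^p lies in P(x^p, y^p, z) while u does not.
\[
  [P(x,y,z):P(x^{p},y^{p},z)] = p^{2}, \qquad
  [F:P(x^{p},y^{p},z)] = p,
\]
\[
  [F(x,y):F] = p^{2}/p = p \;\ne\; p^{2} = [\mathcal{F}(x,y):\mathcal{F}].
\]
```

The two sides of (3) thus differ by a factor p once the coefficient extension is allowed to be inseparable.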

At this point we may mention a somewhat more general type of radical
extensions, in which the allowable coefficient extensions are restricted more
sharply. Let any given field F be embedded in an algebraically complete⁵
field A, and let 𝒞 be a specified subfield of A with the following properties:

(i) All roots of unity from A lie in 𝒞;

(ii) 𝒞 is normal and algebraic over F ∩ 𝒞.

(For example, 𝒞 might simply be the subfield of A generated by all roots of
unity.)

Coefficient extensions are now those extensions generated by subfields of 𝒞.
Thus, we let § denote the intersection F ∩ 𝒞, while the field of constants of
any algebraic extension K of F is to be found by embedding K in A, then
taking the intersection 𝒦 = K ∩ 𝒞. Because of condition (ii), this intersection is
independent of the way in which K may be embedded in A. With such
coefficient extensions and with fields 𝒲 which are subfields of 𝒞, all the lemmas
of this section are still true, provided the condition “𝒲 algebraically closed in
K” be replaced by the condition “K ∩ 𝒞 = 𝒲”.

Relative to this field 𝒞, divisors and radical extensions can still be defined.
By assumption (i) on 𝒞, the group (F) = F*/§* of principal divisors of F still
is an Abelian group without elements of finite order, contained in the larger
group K*/𝒦* of principal divisors of K. Under this interpretation of the
divisors, the whole subsequent theory of radical extensions will go through without
change as a theory of extensions relative to 𝒞. In case 𝒞 is algebraically
complete, this gives simply the previous case. If 𝒞 is not algebraically complete but
if 𝒞 ∩ F is algebraically closed in F, the theory relative to 𝒞 is sharper than
the standard one in the sense that any radical extension of F relative to 𝒞 is
always a radical extension in the standard sense, but not necessarily conversely.


⁵ Algebraically complete = has no proper algebraic extension. We use this term in place
of the older and less convenient one, “algebraically closed”.


3. The inseparable cases. Separable and inseparable radical extensions can
be sharply distinguished.

THEOREM 2. If the field F has the finite characteristic p, then every divisor of
F principal in a separable extension of F has order relatively prime to p and every
divisor principal in a purely inseparable extension of F has order a power of p.
Conversely, if d is any divisor of F principal in some extension K ⊃ F, then, if
d has order prime to p it is principal in some subextension of K separable over
F(𝒦), while if d has as order some power of p, it is principal in some subextension
of K which is purely inseparable over F(𝒦).


Proof. Any divisor d = (r) principal in K is principal in the extension of
F(𝒦) generated by a root r of an equation r^m = xa, irreducible over F(𝒦).
This equation is separable or purely inseparable according as the order m of d
is prime to p or is a power of p. From this fact the various statements of the
theorem follow.

This result means that the study of radical extensions in which the associated
group D of divisors is a p-group (a group in which every element has order some
power of p) is coextensive with the study of purely inseparable extensions.
Specifically, an extension K with coefficient field 𝒦 is a radical extension of F
with a divisor group which is a p-group if and only if K is a purely inseparable
extension of F(𝒦). For example, if F is a function field of one variable over a
perfect coefficient field §, then F has one and only one purely inseparable
extension of degree p^e, and in this extension all divisors of order p^e, and no others,
become principal.

Henceforth, we omit these anomalous cases and treat only separable extensions
K of F and divisor groups D in which the order of every element is prime to the
characteristic p.


4. Crossed characters. The ideals which become principal in a normal
extension of an algebraic number field F_0 give rise to certain functions defined on the
Galois group of the extension. Specifically, if an ideal a of F_0 is a principal ideal
a = (B) in some normal extension K_0 of F_0, then for each automorphism S of
K_0 over F_0, the ideals (B) and (B^S) are equal so that each quotient B^S/B is a unit
E_S of K_0. These units satisfy the functional equation E_S (E_T)^S = E_{ST}; hence
they have been called “crossed characters”. The properties of these functions
have been studied by several authors;⁶ various complications arise because the
functions involve the explicit structure of the group of all units of the number
field K_0. In the analogous situation for an algebraic function field, the units are
replaced by the non-zero constants of the base field. Since these constants
themselves constitute a field, the group structure should be more amenable to
treatment. This gave the starting point of the present investigations.


⁶ See the references given for Theorem 10.3 in [11]. Note that we write (B^T)^S = B^{ST}.

To set up the corresponding functions for a general function field F over §,
let G denote the group of automorphisms u → u^S of a separable normal extension
K/F and observe that each automorphism S ∈ G induces an automorphism b → b^S
of the subfield 𝒦 of all constants (although distinct automorphisms S, T may
induce the same automorphism of 𝒦). A crossed character of G in 𝒦 is a
function which assigns to each S in G a non-zero constant c_S in 𝒦 in such fashion
that

(6) c_S (c_T)^S = c_{ST} (for all S, T in G).


The term-by-term product c_S c′_S of two given crossed characters c_S and c′_S is
again a crossed character, and the set of all crossed characters is a group {c_S}.
If b ≠ 0 is any fixed element of 𝒦, the special function c_S = b/b^S = b^{1−S} always
satisfies (6); it is called a unit character.
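That a unit character satisfies the functional equation (6) is a one-line computation, sketched below. It writes c_S = b/b^S for the unit character attached to a non-zero constant b and uses only the convention (b^T)^S = b^{ST} for composing automorphisms; nothing beyond the definitions of this section is assumed.

```latex
\[
  c_S\,(c_T)^{S}
  = \frac{b}{b^{S}}\cdot\Bigl(\frac{b}{b^{T}}\Bigr)^{S}
  = \frac{b}{b^{S}}\cdot\frac{b^{S}}{(b^{T})^{S}}
  = \frac{b}{b^{ST}}
  = c_{ST}.
\]
```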

Unless K = F(𝒦) is a pure coefficient extension, the Galois group of 𝒦/§ is
a proper homomorphic image of G. However, for the special case of a pure
coefficient extension, Noether’s principal genus theorem “in minimalen” [13],
[18] may be phrased thus: If 𝒦 is a finite separable normal extension of § with
Galois group T, every crossed character of T in 𝒦 is a unit character. This result
will be used repeatedly and is an essential tool for our investigations.

In general, in the Abelian group of all crossed characters, the cosets modulo
the subgroup of unit characters are the so-called classes of associate crossed
characters, and the corresponding factor group {c_S}/{b^{1−S}} is thus the group of
classes of crossed characters.

THEOREM 3. If K, a finite separable normal extension of F, has a radical R
and a Galois group G, then the group (R)/(F) of divisor classes which become
principal in K is isomorphic to the group of classes of crossed characters of G in
the field 𝒦 of constants.

Proof. Each automorphism S of K/F may be regarded as an automorphism
(u) → (u^S) = (u)^S of the group of principal divisors of K. By definition, any
radical r of K has (r)^m = (x) for a suitable integer m and a suitable function x
of F. For any S, (r^S)^m = (r^m)^S = (x)^S = (x) = (r)^m, while (r^S)^m = (r)^m implies
that (r^S) = (r), for there are no divisors of finite order. Therefore, the radicals
r of K are included among those elements z of K for which (z) = (z^S) for every
automorphism S.

Conversely, consider any z with (z) = (z^S), and let y = Π_S z^S be the norm of
z from K to F. If K has the degree m over F, the divisor (y) is then Π_S (z^S) = (z)^m.
Therefore, z must be a radical of K. This fact we record as

COROLLARY 1. In a finite separable normal extension of F the radicals are the
elements z for which (z^S) = (z) for every automorphism S.


Return to the proof of the theorem and consider a radical z. Since the divisor 
of a function is 1 only when the function is a constant, each such z determines 
















a set of constants c_S ∈ 𝒦 with z^S = c_S z. From this one may show formally
that the function c_S satisfies the equation (6); hence it is a crossed character.
Conversely, any given crossed character c_S in 𝒦 may be regarded as a crossed
character in the whole field K. Noether’s principal genus theorem quoted above
applies to K and asserts that c_S has the form c_S = z/z^S for some element z ≠ 0
of K. Then (z) = (c_S z^S) = (z^S); hence, z is a radical. Therefore, the
correspondence z → c_S carries the group R homomorphically onto the whole group of
classes of crossed characters. One may verify that in this correspondence c_S
is a unit character if and only if z is in the group F*𝒦*, generated by constants
and by functions of F. Therefore, the group of classes of crossed characters
is isomorphic to the factor group R/F*𝒦*. But this group, by (2), is
isomorphic to the group D/(F) of those divisor classes which become principal in
K. This proves the theorem.
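The formal verification mentioned in the proof, that the constants c_S defined by z^S = c_S z satisfy (6), may be sketched as follows. The computation uses only that each automorphism S is multiplicative and that (z^T)^S = z^{ST}; the layout is ours.

```latex
\[
  c_{ST}\,z = z^{ST} = (z^{T})^{S} = (c_T\,z)^{S}
  = (c_T)^{S}\,z^{S} = (c_T)^{S}\,c_S\,z,
\]
\[
  \text{so that}\qquad c_{ST} = c_S\,(c_T)^{S}.
\]
```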

The result of the theorem may be applied in particular to a pure coefficient
extension; the result is⁷

COROLLARY 2. In an extension W = F(𝒲) generated by a separable coefficient
extension, no non-principal divisor of F becomes principal.

Proof. If 𝒲 is not already normal, embed it in a separable field normal over
§. If there is any change, more divisors would become principal in the
enlarged field W; hence it suffices to prove the corollary in the case when 𝒲 is
normal. By Lemma 2, the Galois group G of 𝒲/§ is then effectively identical
with that of W/F so that the principal genus theorem asserts that every crossed
character of G in 𝒲 is a unit character or that the group of classes of crossed
characters has but one class. The theorem now shows that no divisors become
principal.

We now turn to the other extreme case, where 𝒦 = §.

COROLLARY 3. If K is a finite separable normal extension with Galois group
G over F and if the coefficient field § is algebraically closed in K, then the number of
divisor classes of F principal in K is finite and is a divisor of the number of
characters of G which can be realized in the coefficient field §. If § has a finite
characteristic p, these numbers are both prime to p.


Proof. In this case, the new coefficient field 𝒦 is just § again and every
automorphism of K acts as the identity on §. By the definition (6), every
crossed character is just an ordinary character, that is, a function c(S) with
c(S)c(T) = c(ST). The number of such functions is finite and their values
are necessarily roots of unity. The number of principal divisor classes is
bounded by the number of such functions with values in 𝒦; this clearly depends
on the presence or absence of suitable roots of unity in 𝒦; an explicit formula
could be derived. In any event, a field § of characteristic p contains no p-th
roots of unity other than 1; hence, it contains no characters of order p, as
asserted above.


⁷ A special case of this result was found by one of us (Schilling) in 1935; it was
communicated to H. Reichardt, who generalized it to any function field of one variable.










To obtain similar limitations on the number of divisors which become
principal in non-normal extensions, we need the following observation on the effect
of a pure coefficient extension upon the radical.


LEMMA 4. If an extension K/F has a radical R, while a separable extension §′
of § gives rise to an extension K′ = K(§′) of F′ = F(§′) with radical R′, then
(R)/(F) is isomorphic to a subgroup of (R′)/(F′).

Proof. The groups (R), (F), and (F′) are all subgroups of the group (R′).
Any element in the intersection (R) ∩ (F′) has the form (x′), where x′ ∈ F′ and
where (x′)^m is in (F), for some m. This means that x′ is a radical of F′ relative
to F; since F′ is a pure coefficient extension, Corollary 2 above shows that (x′)
is in (F). Hence (R) ∩ (F′) = (F). Each coset of (R)/(F) is then contained
in one and only one coset of [(R) ∪ (F′)]/(F′). Since (R) ∪ (F′) ⊂ (R′), this
coset correspondence is the required isomorphism of (R)/(F) to a subgroup
of (R′)/(F′).

We can now state the basic theorem limiting the size of the radical.

THEOREM 4. A finite separable extension K ⊃ F with radical R has the order
of its divisor group (R) bounded by

(7) [(R):(F)] ≤ [K:F(𝒦)].

In this theorem, the replacement of F by F′ = F(𝒦), K by K′ = K(𝒦) = K
and R by R′ will not change the right-hand side of (7) and will not decrease the
left-hand side of (7), because of Lemma 4. It thus suffices to prove the theorem
in the case when 𝒦 = §.

If § = 𝒦, the proof can be given by a two-step replacement of the given field
K. Let L = F(R) be the subfield generated by all radicals of K; it is a radical
extension, with a radical Q which contains R. It is generated by roots r of
various separable binomial equations r^m = xa. To enforce normality, let §′ be
the (separable) extension of § generated by all m-th roots of unity for every m
occurring in such a binomial equation. The modified field L′ = F(§′, R) will
then be normal over F′ = F(§′). Because 𝒦 = §, § is algebraically closed
in L; therefore, Lemma 1 proves §′ algebraically closed in L′ = L(§′). Let
Q′ be the radical of L′/F′. One then has the following table:


             Given Extension     Radical Extension     Normal Extension

Top Field    K            ⊃      F(R) = L       ⊂      L(§′) = L′

Base Field   F            =      F              ⊂      F(§′) = F′

Coef. Field  𝒦 = §        =      §              ⊂      §′

Radical      R            ⊂      Q              ⊂      Q′

                                 FIGURE 1


The problem (see Figure 1) essentially reduces to the study of the normal
extension L′ of F′, where the coefficient field of F′ is algebraically closed in L′.
By Corollary 3, the index [(Q′):(F′)] for its radical is bounded by the number
of characters of the Galois group which can be realized in §′. The number of
characters is at most the order of the group, so this index is at most the degree
of the field. Therefore, [(Q′):(F′)] ≤ [L′:F′]. Tracing back the radicals
through the construction used, we have, using Lemma 4,

[(R):(F)] ≤ [(Q):(F)] ≤ [(Q′):(F′)],

while a similar description of the degrees, using Lemma 3, gives

[L′:F′] = [L(§′):F(§′)] = [L:F] = [F(R):F] ≤ [K:F].

These inequalities combine to prove [(R):(F)] ≤ [K:F], where F = F(𝒦), as
required by (7).

COROLLARY. If K is a normal separable extension of F with constant field
𝒦 = § and if a divisor D of order m becomes principal in K, then all m-th roots
of unity lie in §.

Proof. Let D = (r) for r in K. Then r satisfies an equation r^m = x, for
some x in F. By Theorem 4, the field F(r) has degree at least m over F; hence
this equation for r is irreducible over F. Since K is normal, the conjugates
of r all lie in K; they are ζ^i r for ζ a primitive m-th root of unity. Hence 𝒦 = §
means that ζ lies in §, as asserted.


5. Properties of radical extensions. The radical extensions can be
completely characterized by the order [(R):(F)] of the associated divisor group
D = (R).

THEOREM 5. A finite separable extension K/F with radical R and with
coefficient field 𝒦 is a radical extension of F if and only if

(8) [(R):(F)] = [K:F(𝒦)].

Proof. Suppose first that K is a radical extension so that it is generated by R.
The associated group (R)/(F) of divisor classes is then a finite Abelian group so
that it may be represented as a direct product of cyclic groups of orders
m_1, ..., m_n. Let the generators of these cycles be the divisors

(9) d_1 = (r_1), ..., d_n = (r_n), order of r_i = m_i.

The exponent of the group (R)/(F) is then

(10) e = l.c.m.(m_1, ..., m_n)

and the order is the product m_1 ··· m_n. Each generating radical r_i satisfies an
equation of the form

(11) r_i^{m_i} = b_i x_i (b_i in 𝒦, x_i in F).

The radicals r_1, ..., r_n, together with the elements of F and 𝒦, generate
the whole group R of radicals; hence the radical extension K is generated by
them as K = F(𝒦, r_1, ..., r_n). In view of the equations (11) for these
generating elements, the total degree of this field is at most [K:F(𝒦)] ≤ m_1 ··· m_n =
[(R):(F)]. Combined with the reverse inequality as given by (7), this proves
the desired result (8).

Incidentally, this argument shows also that each of the equations (11) of
degree m_i for a radical r_i of order m_i is an irreducible equation over F(𝒦). This
fact could also be established directly, without appeal to Theorem 4.

Conversely, suppose that the relation (8) holds for the radical R of an
extension K. The subfield F(R) generated by this radical is then a radical extension
which has the same radical R and the same coefficient field 𝒦 as does K.
Therefore (8) applies to this extension F(R) and gives [(R):(F)] = [F(R):F(𝒦)]. By
assumption, [(R):(F)] = [K:F(𝒦)]. The subfield F(R) has thus the same
degree as the whole field so that F(R) = K and K is indeed a radical extension.
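A small example, under assumptions not made in the text, shows the equality (8) at work. Take the complex numbers as coefficient field, F the rational function field in t, and K generated by square roots r_1 of t and r_2 of t + 1; the choice of these particular fields and radicals is hypothetical and serves only as an illustration.

```latex
% Illustrative assumption: F = \mathbb{C}(t), K = F(r_1, r_2),
% r_1^2 = t, r_2^2 = t + 1.  The constant field of K is \mathbb{C},
% so F(\mathcal{K}) = F.
%
% None of t, t + 1, t(t + 1) is a constant times a square in F, so the
% classes of 1, (r_1), (r_2), (r_1 r_2) modulo (F) are distinct and
% [K : F] = 4.  With the bound (7) this gives
\[
  4 \;\le\; [(R):(F)] \;\le\; [K:F(\mathcal{K})] = [K:F] = 4 ,
\]
\[
  (R)/(F) \;\cong\; \mathbb{Z}/2 \times \mathbb{Z}/2 , \qquad
  m_1 = m_2 = 2, \qquad e = \operatorname{l.c.m.}(2, 2) = 2 .
\]
```

Here the order m_1 m_2 = 4 of the group of divisor classes equals the degree of K over F, as the theorem requires of a radical extension.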

As a consequence of this theorem, we can derive a result subsequently useful 
in recognizing the radicals of certain given extensions. 

COROLLARY. If K is a separable extension of F generated from F by the
adjunction of a multiplicative group R_0 ⊃ F* which consists of radicals, then R_0·𝒦* is the
radical of K and (R_0) is the group D of all divisors of F principal in K.

Proof. In any event the group R_0·𝒦* is contained in the radical R. Suppose
first that K/F is finite. The finite group (R_0)/(F) may then be generated
exactly as in (9)-(11). The argument used to prove the theorem then shows
that [K:F(𝒦)] = [F(𝒦, R_0):F(𝒦)] ≤ [(R_0):(F)]. But (8) then gives

[(R):(F)] = [K:F(𝒦)] ≤ [(R_0):(F)].

This proves that (R) = (R_0) and hence that R ⊂ R_0𝒦*. The group D of
principal divisors is D = (R) = (R_0𝒦*) = (R_0), as asserted.

If K is infinite over F, the assertions of the corollary may be proved by
applying the results of the finite case to a directed system of subgroups R_α ⊂ R_0
of finite order over F*.

Next, we consider the properties of subfields of a radical extension. 

THEOREM 6. If K is a separable radical extension of F, then any subfield K′
of K containing F is also a radical extension.

For this proof, we denote the radical of any field K relative to a subfield F
by R[K/F]. We observe at once that the radical of K′ is given as the
intersection

(12) R[K′/F] = R[K/F] ∩ K′*.

On the other hand, the composite group R[K/F]·K′* contains only radicals of K
over K′. Since R[K/F] by itself generates the whole field K over F, this larger
group R[K/F]·K′* generates K over K′. By the corollary above, this group
is thus the whole radical of K over K′ so that⁸

(13) R[K/K′] = R[K/F]·K′*.


⁸ Incidentally, this formula generalizes Theorem 14 of [6].


















By one of the isomorphism theorems for Abelian groups, the two assertions (12)
and (13) combine to give the isomorphism

(14) R[K/F]/R[K′/F] ≅ R[K/K′]/K′*.


Consider now the special case when K′ contains the whole field 𝒦 of
coefficients. These relations will then still hold, if in the formulas each group R_0
of radicals is replaced by the corresponding group (R_0) of principal divisors.
For example, in (12), a divisor common to the groups (R[K/F]) and (K′*) has
the form d = (r) = (u), for r a radical of K and u in K′. Then u = ra, for
some a ∈ 𝒦. Since K′ ⊃ 𝒦, u also will be a radical of K′ so that d = (u) is in
the group (R[K′/F]). The rest of (12) and (13) are easily established in this
case, thus giving also the analogue of (14), for K′ ⊃ 𝒦, as

(14a) (R[K/F])/(R[K′/F]) ≅ (R[K/K′])/(K′).


Return now to the proof of the theorem in the case when K is finite over F.
Consider first the case of an intermediate field K′ with K′ ⊃ 𝒦 and suppose,
contrary to the result of the theorem, that K′ is not a radical extension of F.
Thus, Theorems 4 and 5 give

[(R[K′/F]):(F)] < [K′:F(𝒦)].

Since K still is a radical extension of K′, Theorem 5 also gives

[(R[K/K′]):(K′)] = [K:K′].

By (14a), the term on the left here may be replaced by the index

[(R[K/F]):(R[K′/F])].

The two results then combine to give the inequality

[(R[K/F]):(F)] < [K:K′]·[K′:F(𝒦)] = [K:F(𝒦)].

By Theorem 5, this contradicts the assumption that K is a radical extension of F.
Therefore K′/F is a radical extension.
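The way the two results combine may be made explicit by factoring the index through the intermediate group of divisors. The display below is a sketch of that step, writing \mathcal{K} for the field of constants of K and K' for the intermediate field; it assumes only the multiplicativity of group indices and of field degrees.

```latex
\[
  [(R[K/F]):(F)]
  = [(R[K/F]):(R[K'/F])]\cdot[(R[K'/F]):(F)]
  < [K:K']\cdot[K':F(\mathcal{K})]
  = [K:F(\mathcal{K})].
\]
% First factor: equals [K:K'] by (14a) and Theorem 5 applied to K/K'.
% Second factor: strictly less than [K':F(\mathcal{K})] by hypothesis.
```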

Now consider the general case of a field K′ not containing the whole constant
field 𝒦. It has an extension L = K′(𝒦) which is a radical extension of F, by
the case already treated. Since this field is a coefficient extension of K′,
Corollary 2 of Theorem 3 asserts that every divisor principal in L is already
principal in K′ so that (R[K′/F]) = (R[L/F]) and

[(R[K′/F]):(F)] = [(R[L/F]):(F)] = [L:F(𝒦)].

On the other hand, let 𝒦′ be the field of coefficients of K′, and apply Lemma 3
to the finite extension K′ of F(𝒦′). It proves that

[L:F(𝒦)] = [K′(𝒦):F(𝒦)] = [K′:F(𝒦′)].

These two results combine to give the equality

[(R[K′/F]):(F)] = [K′:F(𝒦′)].

By Theorem 5, this suffices to prove K′ a radical extension.

To complete the proof of Theorem 6, we need only treat the case of a sub- 
field K’ of an infinite extension K. Every element u of K’ is contained in a sub- 
field of K generated by a finite number of radicals of K; to this subfield the 
preceding argument applies to show that the given field K’ is indeed a radical 
extension of F. 


COROLLARY 1. Let $D$ and $D'$ be respectively the groups of divisors of $F$ principal
in $K$ and $K'$. The correspondence $K' \to D'/(F)$ maps the lattice of all those fields
$K'$ which contain $\mathfrak{K}$ isomorphically on the lattice of all subgroups of the group
$D/(F)$ of divisor classes.


Proof. It is clear that the correspondence $K' \to D'$ is one which preserves
inclusion ($K' \subset K''$ implies $D' \subset D''$). To obtain the lattice isomorphism, it
then suffices to show that there is one and only one field corresponding to a
given group $D'$ of divisors with $D \supset D' \supset (F)$. For given $D'$, let $R_0$ be the
set of all those radicals $r$ of $K$ for which the principal divisor $(r)$ lies in $D'$ and
set $K_0 = F(R_0)$. Then $R_0$ is a group which contains $F^*$ and $\mathfrak{K}^*$ and $(R_0) = D'$
so that the corollary of Theorem 5 shows that $K_0$ is a radical extension with
$K_0 \to D'$. Furthermore, if $K'$ is any other field with $K' \to D'$, this field must
contain some one radical $r$ corresponding to each divisor of $D'$ so that it must
contain all radicals $R_0$ used to construct $K_0$, whence $K' = K_0$, and $K_0$ is unique.
This completes the proof.


COROLLARY 2. Let $R$ and $R'$ be respectively the radicals of $K$ and $K'$. The
correspondence $K' \to R'$ maps the lattice of all intermediate fields $K'$ isomorphically
on the lattice of those subgroups $R'$ of the whole group $R$ which satisfy the conditions
(i) $R' \supset F^*$; (ii) $R' \cap \mathfrak{K}^*$ is closed under subtraction of distinct elements.


Proof. Since each $K'$ is a radical extension, it is determined by its radical $R'$.
The correspondence $K' \to R'$ is thus one-one; since it preserves inclusion, it is a
lattice isomorphism. It remains only to prove that every group $R'$ satisfying
conditions (i) and (ii) is the radical of some intermediate field. Since $R'$
contains $-1$, it contains $-a$ with $a$; hence the assumption (ii) will prove $R' \cap \mathfrak{K}^*$
is (except for the absence of 0) a subfield $\mathfrak{L}$ of $\mathfrak{K}$. The field $K' = F(R')$ is then
a radical extension of $F$ and its radical may be computed, by the corollary to
Theorem 5, to be just $R'\mathfrak{K}'^*$, where $\mathfrak{K}'$ is the field of constants from $K'$. If
we can prove $\mathfrak{K}' = \mathfrak{L}$, we shall have proved that the given group $R'$ is the radical
of a field $K'$.

Suppose instead that $\mathfrak{K}' \supset \mathfrak{L}$ properly. Then $F(\mathfrak{K}') \supset F(\mathfrak{L})$ so that Theorem 5 gives

$[K':F(\mathfrak{L})] > [K':F(\mathfrak{K}')] = [(R'):(F)]$.
















A GENERAL KUMMER THEORY FOR FUNCTION FIELDS 141 


Now the correspondence $r \to (r)$ maps the Abelian group $R'/F^*\mathfrak{L}^*$ isomorphically
on $(R')/(F)$. As in (9), one may choose a basis $(r_1), \cdots, (r_h)$ for the latter
Abelian group; the group $R'$ is then generated over $F^*\mathfrak{L}^*$ by the representatives
$r_1, \cdots, r_h$ with $r_i^{m_i} = b_i x_i$, where each $b_i$ is in $\mathfrak{L}$ and each $x_i$ in $F$, while
$m_1 m_2 \cdots m_h = [(R'):(F)]$. The field $K' = F(\mathfrak{L}, R') = F(\mathfrak{L}, r_1, \cdots, r_h)$ then
has a degree at most

$[K':F(\mathfrak{L})] \leq m_1 m_2 \cdots m_h = [(R'):(F)]$.

This contradicts the previous inequality and hence the assumption $\mathfrak{K}' \supset \mathfrak{L}$.

A radical extension is not uniquely determined up to isomorphism by its
divisor group and its field $\mathfrak{K}$ of coefficients, as one may readily show by examples.
Instead, one has the following modified uniqueness theorem, in which, for 
simplicity, all extensions of F are supposed to be within a fixed algebraically 
complete extension of F. 

THEOREM 7. If $K$ and $K'$ are separable radical extensions of $F$ with the same
group $D$ of associated divisors, then there exists an algebraic extension $\mathfrak{M}$ of the
field of coefficients such that $K(\mathfrak{M}) = K'(\mathfrak{M})$. In case $K$ is finite over $F$, one may
also require that $\mathfrak{M}$ be finite over $\mathfrak{F}$.

Proof. Suppose first that $K/F$ is finite, and generate $K$ as
$K = F(\mathfrak{K}, r_1, \cdots, r_h)$ by radicals $r_i$, as in (9), (10), and (11). With the same
generation of the group of divisors, the second radical extension $K'$ will have a
generation $K' = F(\mathfrak{K}', r_1', \cdots, r_h')$ in which the $r_i'$ satisfy equations
$r_i'^{m_i} = b_i' x_i$, of the form of equations (11). The field $\mathfrak{M}$ obtained from $\mathfrak{F}$ by the
adjunction of $\mathfrak{K}$, $\mathfrak{K}'$, and all the roots of the equations $t^{m_i} = b_i/b_i'$ then has the
desired properties. The case of an infinite field $K$ is treated by the same
argument applied to the finite subfields of $K$.


6. The group of radicals. The ordinary Kummer theory studies extensions 
K of a field F which can be generated by the adjunction of n-th roots to F 
(n fixed and prime to the characteristic). On the assumption that all n-th roots 
of unity lie in F, each such Kummer field K is uniquely determined by a certain 
multiplicative group in F; namely, the group of all those elements of F which 
have n-th roots in K (see, e.g., [20], [3]). In similar vein, the class field theory 
determines all Abelian extensions of number fields and of certain function fields 
uniquely in terms of suitable multiplicative groups. If one seeks for a simple 
multiplicative group which is related in similar manner to the radical extensions 
studied here, the answer is to be found, not in the group of divisors (Theorem 7), 
but in the group R of all radicals. 

For a given separable extension K, the group R of radicals is an Abelian group 
with the following properties: 

(i) $R$ contains the multiplicative groups $F^*$ and $\mathfrak{K}^*$ of the fields $F$ and $\mathfrak{K}$,
and thus also the group composite $F^*\mathfrak{K}^*$;
(ii) $R/\mathfrak{K}^*$ has no elements of finite order;







(iii) $R/F^*\mathfrak{K}^*$ has every element of a finite order, prime to the characteristic
of $F$.

Abstractly speaking, $R$ is not merely a multiplicative group; it is rather a group
of which certain portions are the multiplicative groups of fields. (This
corresponds to the fact that the radical extensions include all possible coefficient
extensions so that any tool adequate to describe all radical extensions must
describe all algebraic extensions of the constant field $\mathfrak{F}$.) However, instead of
treating $R$ postulationally, we now show why the structure of a group $R$ with
these properties does determine the structure of $K$.

THEOREM 8. Let $K$ and $K'$ be separable radical extensions of $F$ with radicals
$R$ and $R'$ which are isomorphic under a correspondence $r \leftrightarrow r^{\varphi}$ which is the identity
on the subgroup $F^*$ and which carries the coefficient field $\mathfrak{K}$ isomorphically onto $\mathfrak{K}'$.
Then $\varphi$ may be extended in one and only one way to an isomorphism of $K$ to $K'$.


Observe that the hypotheses placed on the correspondence $\varphi$ in this theorem
refer only to the structure of $R$ relative to the subsystems $F^*$, $\mathfrak{K}^*$, as
summarized above.

Proof. Assume first that $(R)/(F^*)$ is finite. By the assumption on the
coefficient field, the given isomorphism $\varphi$ can be extended to an isomorphism of $F(\mathfrak{K})$
to $F(\mathfrak{K}')$. The field $K$ can be generated from $F(\mathfrak{K})$ by the successive adjunction
of the roots $r_i$ of the irreducible equations $t^{m_i} = b_i x_i$, as in (11). These
equations are purely multiplicative, so the corresponding generating radicals $r_i^{\varphi}$ of $K'$
will satisfy corresponding irreducible separable equations $t^{m_i} = b_i^{\varphi} x_i$. Repeated
applications of a basic extension theorem of the Galois theory then give one
and only one isomorphism which extends $\varphi$ and maps $r_i$ on $r_i^{\varphi}$. This completes
the finite case.

The infinite case is treated by the usual approximation devices. For each
subgroup $R_0 \subset R$ with a finite order over $F^*\mathfrak{K}^*$, one has a unique extension of $\varphi$
to an isomorphism of $F(R_0)$ to $F(R_0^{\varphi})$. Because of the uniqueness of the
extension, two such extensions agree wherever their fields overlap. They may then
be combined to give a single extension to all of $K$.


COROLLARY. The group of all automorphisms $T$ of a separable radical
extension $K/F$ with radical $R$ is the group of all those (multiplicative) automorphisms
$r \leftrightarrow r^T$ of the group $R$ which have the following properties: (i) $x^T = x$ if $x \in F^*$;
(ii) $a^T \in \mathfrak{K}^*$ if and only if $a \in \mathfrak{K}^*$; (iii) $(a + b)^T = a^T + b^T$, provided $a$, $b$, and
$a + b$ are all in $\mathfrak{K}^*$.


Proof. Every field automorphism induces such a $T$; conversely, each such $T$
will by the theorem give an automorphism of $K$, provided $T$ is an automorphism
of the field $\mathfrak{K}$. This follows from the conditions (ii) and (iii), with the
observation that $(-a)^T = [(-1)a]^T = (-1)a^T = -a^T$, so that (iii) holds even if
$a + b = 0$.

A companion to the above uniqueness theorem is the following existence 
theorem. 







THEOREM 9. If $\mathfrak{K}$ is any separable algebraic extension of the coefficient field $\mathfrak{F}$
of $F$ and if $R \supset \mathfrak{K}^*$ is an Abelian group with the properties (i), (ii), (iii) stated
earlier in this section, then there exists one (and essentially only one) separable
radical extension $K$ of $F$ which has $\mathfrak{K}$ as field of coefficients and $R$ as radical.


Observe that we assume $R$ given with a specified subgroup $\mathfrak{K}^*$; it would be
a priori possible that more than one subgroup of a given $R$ could be interpreted
as the multiplicative group of a field. The theorem itself is essentially just an
expression of the fact that the equations (11) which define a radical extension
over its coefficient extension are properly multiplicative in form so they are
determined by the group structure of $R$.

Proof. Since every element of $R/F^*\mathfrak{K}^*$ has finite order, it will suffice, as in
previous cases, to suppose that $R/F^*\mathfrak{K}^*$ is finite. Because of assumptions (ii)
and (iii), the group $R/\mathfrak{K}^* = (R)$ can be identified with a group $D$ of fractional
divisors of $F$. Let $(r)$ denote the divisor belonging to the element $r \in R$. Select
a basis

(15) $d_1 = (r_1), \cdots, d_h = (r_h)$, order of $r_i = m_i$,

for the finite Abelian group $R/F^*\mathfrak{K}^*$. Then $t = r_i$ must satisfy a separable
equation of the form

(16) $t^{m_i} = x_i b_i$, $b_i \in \mathfrak{K}$, $x_i \in F$.


Using any formal roots $r_i'$ of these $h$ equations over $F(\mathfrak{K})$, one can now
construct a field $K'$ as $K' = F(\mathfrak{K}, r_1', \cdots, r_h')$. By means of the previously
established properties of radical extensions, one then sees that this field is indeed a
radical extension, that its divisor group $D'$ is exactly the previous group $D$,
that its radical $R'$ is isomorphic to the given radical $R$ under a correspondence
$r_i \leftrightarrow r_i'$, and that its field of coefficients is just $\mathfrak{K}$. We have thus found a field
with a given radical $R$; the uniqueness of the construction is asserted by the
previous theorem.

As another illustration of the adequacy of the multiplicative description of $R$
by (i), (ii), and (iii), consider three fields $F \subset L \subset K$. Given the radicals
$R[K/L]$ and $R[L/F]$ as groups, the radical $R[K/F]$ is completely determined as
the set of all those elements $r$ in $R[K/L]$ for which some power $r^m$ is in the
subgroup $F^* \cdot \mathfrak{K}^*$ of $R[K/L]$. In terms of the subgroup $L^*$ of $R[K/L]$ one may
state


THEOREM 10. If $K$ is a separable radical extension of $L$ and $L$ a separable
radical extension of $F$, then $K$ is a radical extension of $F$ if and only if $R[K/L]$
is the group composite $R[K/F] \cdot L^*$.

Proof. The necessity of this condition was established in (13) of §5. To
prove the sufficiency, observe that $R[K/F] \supset R[L/F]$ so that the adjunction
to $F$ of the radical $R[K/F]$ will first generate the whole field $L$, then the whole
radical $R[K/F] \cdot L^* = R[K/L]$, and therefore the whole of the radical
extension $K/L$.







Arithmetically, one of the most important types of function fields is that in
which the coefficient field $\mathfrak{F}$ is a finite field, say with $q = p^n$ elements. In this
case, the description of the group $R$ can be substantially simplified because the
part of $R$ to be identified with the coefficient extension $\mathfrak{K}$ can be described in
purely group-theoretic terms. Indeed, a finite extension $\mathfrak{K}$ of degree $f$ over $\mathfrak{F}$
will have $q^f$ elements and its multiplicative group is cyclic of order $q^f - 1$.

Conversely, let $\mathfrak{S}$ be a cyclic multiplicative group of order $q^f - 1$ with
$\mathfrak{F}^*$ as a subgroup. We assert that $\mathfrak{S}$ can be mapped isomorphically onto the
multiplicative group $\mathfrak{K}^*$ of the finite extension $\mathfrak{K}$ of degree $f$ over $\mathfrak{F}$ in such
manner that the elements of $\mathfrak{F}^*$ are all left fixed in the mapping (without this
last proviso, the assertion would be trivially true). For, let $\omega$ be any element
generating the multiplicative group $\mathfrak{K}^*$, so that

$\omega^g = \zeta$, $g = (q^f - 1)/(q - 1)$,

is an element generating the multiplicative group $\mathfrak{F}^*$. If $\sigma$ is any generator of
the given group $\mathfrak{S}$, then

$\sigma^g = \zeta^i$, $(i, q - 1) = 1$.

Any other generator $\sigma'$ of $\mathfrak{S}$ would have the form $\sigma' = \sigma^y$, for some integer $y$
prime to $q^f - 1$, and for this generator $(\sigma')^g = \sigma^{gy} = \zeta^{iy}$. Let $l_1, \cdots, l_k$ be the
various prime factors of $q^f - 1$ which are not divisors of $q - 1$. Then the
congruences

$iy \equiv 1 \pmod{q - 1}$, $y \equiv 1 \pmod{l_1}$, $\cdots$, $y \equiv 1 \pmod{l_k}$

have a solution $y$ which will be relatively prime to $q^f - 1$ because it is prime
to the factors $l_j$ and to the factor $q - 1$. This solution $y$ determines a generator
$\sigma'$ for which $(\sigma')^g = \zeta^{iy} = \zeta$. The correspondence $\sigma' \to \omega$ then maps $\mathfrak{S}$
isomorphically upon $\mathfrak{K}^*$, and leaves $\zeta$ and its powers fixed, as was asserted.
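
The choice of the exponent $y$ in this construction can be checked numerically.
The following Python sketch is an illustration only and not part of the original
argument: the names `prime_factors` and `adjusting_exponent` are hypothetical,
and the congruences are solved by direct search rather than by any method the
text prescribes. Given $q$, $f$, and $i$, it returns the least positive $y$ satisfying
the congruences, so that $\sigma' = \sigma^y$ has $(\sigma')^g = \zeta$.

```python
from math import gcd


def prime_factors(n):
    """Return the set of prime divisors of n (trial division)."""
    factors, p = set(), 2
    while p * p <= n:
        while n % p == 0:
            factors.add(p)
            n //= p
        p += 1
    if n > 1:
        factors.add(n)
    return factors


def adjusting_exponent(q, f, i):
    """Least y > 0 with i*y = 1 (mod q-1) and y = 1 (mod l) for every prime l
    dividing q**f - 1 but not q - 1.  Hypothetical helper, solved by search."""
    if gcd(i, q - 1) != 1:
        raise ValueError("i must be prime to q - 1")
    n = q**f - 1
    ls = sorted(l for l in prime_factors(n) if (q - 1) % l != 0)
    modulus = q - 1
    for l in ls:
        modulus *= l  # q - 1 and the primes l are pairwise coprime
    for y in range(1, modulus + 1):
        if (i * y - 1) % (q - 1) == 0 and all(y % l == 1 for l in ls):
            return y
    raise AssertionError("unreachable: the congruences are always solvable")


# Example: q = 4, f = 2, and a generator sigma with sigma^g = zeta^2.
q, f, i = 4, 2, 2
y = adjusting_exponent(q, f, i)
n, g = q**f - 1, (q**f - 1) // (q - 1)
# y is prime to q^f - 1, and i*y*g = g (mod n), i.e. (sigma^y)^g = zeta.
print(y, gcd(y, n) == 1, (i * y * g) % n == g % n)
```

In the example the group has order 15; the search runs over one full period of
the combined modulus $(q - 1) l_1 \cdots l_k$, which is why a solution is always found.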
In view of this construction, Theorem 9 now becomes the following assertion. 


THEOREM 11. Let $F$ be a field of characteristic $p$ with finite subfield $\mathfrak{F}$, of $q$
elements, which is algebraically closed in $F$. Then an Abelian multiplicative
group $R$ is the radical of a separable extension $K$ of $F$ if and only if

(i) $R$ contains the multiplicative group $F^*$ of $F$.

(ii) The elements of finite order in $R$ form a cyclic group of order $q^f - 1$ for
some integer $f$.

(iii) $R/F^*$ has every element of finite order.

Observe, however, that we no longer have an analogue of Theorem 8 (the
uniqueness theorem) for a radical as described merely by the properties (i), (ii),
(iii) of this theorem; in other words, the requirement of Theorem 8 that the
mapping $\varphi$ of the radical $R$ upon another radical $R'$ be an isomorphism of the
field $\mathfrak{K}$ is essential. This is because the above interpretation of the
multiplicative group $\mathfrak{S}$ as the group $\mathfrak{K}^*$ of a field can usually be performed in many different










ways, which are not all equivalent under isomorphisms of $\mathfrak{K}$. This is the case,
for instance, if $q = 3$, $f = 2$.
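
The instance $q = 3$, $f = 2$ can be counted directly. The sketch below is an
illustration under stated assumptions, not the authors' computation: the
function name `inequivalent_interpretations` is hypothetical, one fixed
isomorphism $\sigma_0 \to \omega$ that leaves the subfield's multiplicative group
fixed is taken as reference, and every other isomorphism is recorded by the
unit $u$ modulo $q^f - 1$ with $\sigma_0 \to \omega^u$. Such a map fixes the subgroup
elementwise exactly when $u \equiv 1 \pmod{q - 1}$, and two maps differ by a field
automorphism exactly when their units differ by a power of $q$ (the Frobenius
substitution).

```python
from math import gcd


def inequivalent_interpretations(q, f):
    """Return (admissible units, Frobenius orbits) modulo n = q**f - 1.
    Assumes n > 1.  Each orbit is one class of field interpretations."""
    n = q**f - 1
    # Units u with u = 1 (mod q - 1): isomorphisms fixing the subfield group.
    admissible = [u for u in range(1, n)
                  if gcd(u, n) == 1 and (u - 1) % (q - 1) == 0]
    seen, classes = set(), []
    for u in admissible:
        if u in seen:
            continue
        # Composing with x -> x^(q^j) multiplies the unit by q^j.
        orbit = {(u * pow(q, j, n)) % n for j in range(f)}
        seen |= orbit
        classes.append(sorted(orbit))
    return admissible, classes


admissible, classes = inequivalent_interpretations(3, 2)
print(admissible, classes)
```

For $q = 3$, $f = 2$ there are four admissible isomorphisms but only two field
automorphisms, so the four fall into two classes that no automorphism of the
field of nine elements can interchange.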


CHAPTER II 
GALOIS THEORY OF RADICAL EXTENSIONS 
7. Conditions for normality. For a given radical extension $K/F$ every
automorphism $S$, with $u \leftrightarrow u^S$, induces an automorphism $\sigma$ of the corresponding
extension $\mathfrak{K}/\mathfrak{F}$. Furthermore, each radical $r$ must be mapped upon another
radical $r^S$ with the same divisor so that $S$ determines a set of constants $a = a(S, r)$
with

(1) $r^S = a(S, r) \cdot r$, where $a(S, r) \in \mathfrak{K}^*$.

The product $ST$ of two automorphisms will mean: first apply $T$, then $S$, so
that $r^{ST} = (r^T)^S$. One then has

(2) $a(S, r)\, a(T, r)^{\sigma} = a(ST, r)$;

while, for the product of two radicals,

(3) $a(S, rr') = a(S, r)\, a(S, r')$.

If $r$ is a radical of order $m$, with $r^m = bx$, for $b$ in $\mathfrak{K}$ and $x$ in $F$, the application
of $S$ to this equation proves that

(4) $[a(S, r)]^m = b^{\sigma - 1}$ $(r^m = bx)$.

In particular, if $r$ is in $F$, then $a(S, r) = 1$.

The Galois group of a normal extension $K/F$ will be analyzed in terms of
the function $a(S, r)$ with these properties (2), (3), and (4). If $K$ is a finite
radical extension, generated by $h$ radicals $r_i$, as in (9), (10), and (11) of §5,
then the function $a(S, r)$ is essentially determined by the $h$ partial functions

(5) $a_i(S) = a(S, r_i)$, $i = 1, \cdots, h$.

From these, the other values of $a(S, r)$ may be computed, using (3) and (4).
In the presence of the necessary condition (4) these functions determine the
automorphism in the following sense.


Lemma 5. Let $K/F$ be a separable radical extension generated by radicals
$r_1, \cdots, r_h$ of orders $m_1, \cdots, m_h$, as in §5; let $\sigma$ be an automorphism of the
coefficient extension $\mathfrak{K}/\mathfrak{F}$; and let the constants $a_1, \cdots, a_h$ in $\mathfrak{K}$ satisfy the conditions

(6) $a_i^{m_i} = b_i^{\sigma - 1}$, $i = 1, \cdots, h$.

Then there exists one and only one extension $S$ of $\sigma$ which is an automorphism of $K$
over $F$ with the property

(7) $r_i^S = a_i r_i$, $i = 1, \cdots, h$.







Proof. Any radical $r$ of $K$ has the form

(8) $r = r_1^{e_1} \cdots r_h^{e_h} c y$, $0 \leq e_i < m_i$, $c \in \mathfrak{K}$, $y \in F$.

We define a corresponding radical $r^S$ by

(9) $r^S = a_1^{e_1} \cdots a_h^{e_h} c^{\sigma - 1} r$.

The assumed condition (6) insures that this formula holds for any exponents $e_i$,
even if $e_i \geq m_i$. Thus (9) gives an automorphism $r \leftrightarrow r^S$ of $R$ to itself which
realizes the prescription (7). Theorem 8 then gives the extension of this group
automorphism to one of the field $K$.

We next derive a condition for normality which bears a close resemblance
to a theorem due to Albert, giving conditions that an extension of a p-adic field
be normal ([1], Theorem 6).

THEOREM 12. A finite separable radical extension $K/F$ is normal if and only
if the following conditions hold:

(i) The coefficient extension $\mathfrak{K}/\mathfrak{F}$ is normal;

(ii) $\mathfrak{K}$ contains all $e$-th roots of unity, where $e$ is the exponent of the
associated group $(R)/(F)$ of divisor classes;

(iii) For each automorphism $\sigma$ of $\mathfrak{K}/\mathfrak{F}$ and for each radical $r$ of order $m$ with
$r^m = bx$, for $b$ in $\mathfrak{K}$, $x$ in $F$, the constant $b^{\sigma - 1}$ is an $m$-th power in $\mathfrak{K}^*$.


Remarks. Essentially the same theorem holds for an infinite extension, if
(ii) is appropriately modified to require that all $m$-th roots of unity be in $\mathfrak{K}$
for each order $m$ of a divisor of $(R)/(F)$; alternatively, one may still use
condition (ii) as stated, if the exponent $e$ be interpreted as a "Steinitz G-number"
(the formal l.c.m. of the orders of all elements of $(R)/(F)$, written as a possibly
infinite product of primes with possibly infinite exponents).

If a finite extension $K/F$ is generated over $F(\mathfrak{K})$ by radicals $r_1, \cdots, r_h$ as
in §5, the condition (iii) of this theorem may be replaced by the parallel
condition applied to these radicals alone:

(iiia) For each $\sigma$, $b_i^{\sigma - 1}$ is an $m_i$-th power in $\mathfrak{K}^*$, for $i = 1, 2, \cdots, h$.


Proof. Assume first that $K/F$ is normal. The condition (i) is immediate,
and (iii) follows from the equation (4) deduced above. Finally, if $e$ is the
exponent of $(R)/(F)$, there is a radical $r$ of order $m = e$ which satisfies over
$F(\mathfrak{K})$ an equation $t^e = bx$ which is irreducible over $F(\mathfrak{K})$. Since one root $r$ of
this equation lies in $K$, the other roots $\zeta^j r$ also lie in $K$, where $\zeta$ is a primitive
$e$-th root of unity. By the assumption of separability, $e$ is prime to the
characteristic.

Conversely, it is clear that condition (iii) implies its special case (iiia), so
that it will suffice to show that (i), (ii), and (iiia) insure the normality of the
extension $K = F(\mathfrak{K}, r_1, \cdots, r_h)$. Let $r = r_i$ be one of these generating radicals,
satisfying an equation $t^m = bx$ over $F(\mathfrak{K})$, for $b = b_i$, $m = m_i$. Over $F$ this
radical satisfies the equation

$f(t) = \prod_{\sigma} (t^m - b^{\sigma} x)$.



















By assumption, $b^{\sigma - 1}$ has the form $(a_{\sigma})^m$ for some constant $a_{\sigma}$ in $\mathfrak{K}$, and $\mathfrak{K}$
contains a primitive $m$-th root of unity $\zeta$. One then computes that the quantities
$\zeta^j a_{\sigma} r$, for $j = 0, 1, \cdots, m - 1$, include all the roots of the polynomial $f(t)$.
They all lie in the field $K$, so that $K$, as generated by the normal extension
$\mathfrak{K}$ and the roots $r_i$ of these polynomials $f(t)$, is itself normal over $F$, as asserted
in the theorem.


8. Characters and the Galois group. Let $H$ be the Galois group of a
separable normal radical extension $K$ over its coefficient extension $F(\mathfrak{K})$. This
subgroup $H$ of the whole Galois group depends essentially on the (ordinary)
characters of the divisor class group $D = (R)/(F)$. Such a character is
essentially a function $C$ which defines for each radical $r$ in $R$ a value $C(r)$ in the
coefficient field $\mathfrak{K}$ such that

(10) $C(rr') = C(r)C(r')$,
(11) $(r) \in (F)$ implies $C(r) = 1$.

By the last theorem, the presence of a radical $r$ of order $m$ in $K$ insures the
presence in $\mathfrak{K}$ of all $m$-th roots of unity; hence every linear character which
could be realized in an algebraically complete field (of the same characteristic
as $\mathfrak{K}$) can already be realized in $\mathfrak{K}$. Two characters $C_1$, $C_2$ have a product
$C_1C_2$ defined by

(12) $C_1C_2(r) = C_1(r)C_2(r)$, $r \in R$.

Under this multiplication, the characters $C$ form a group $X$; in case $D$ is finite,
$X$ is isomorphic to $D$.
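
The statement that a finite $D$ has as many characters as elements can be seen
in a small model. The following Python sketch is only an illustration under
assumptions not made in the text: $D$ is taken as a product of cyclic groups of
orders `orders`, a character value is stored as a fraction $t$ modulo 1 standing
for the root of unity $e^{2\pi i t}$ (complex roots of unity in place of roots of unity
in the coefficient field), and the names `character_table` and `multiply` are
hypothetical.

```python
from fractions import Fraction
from itertools import product


def character_table(orders):
    """Elements of Z/m_1 x ... x Z/m_h and one character per element k.
    The value of character k at d is sum(k_j * d_j / m_j) modulo 1."""
    elements = list(product(*(range(m) for m in orders)))
    table = {}
    for k in elements:
        table[k] = {
            d: sum(Fraction(kj * dj, m) for kj, dj, m in zip(k, d, orders)) % 1
            for d in elements
        }
    return elements, table


def multiply(c1, c2):
    """Pointwise product of two characters, as in formula (12);
    multiplying roots of unity adds their exponents modulo 1."""
    return {d: (c1[d] + c2[d]) % 1 for d in c1}


orders = (2, 4)
elements, table = character_table(orders)
distinct = {tuple(sorted(c.items())) for c in table.values()}
# As many distinct characters as group elements, and closed under products.
print(len(elements), len(distinct))
print(multiply(table[(1, 0)], table[(0, 1)]) == table[(1, 1)])
```

Indexing the characters by the elements $k$ of the same product of cyclic groups
is what exhibits the (non-canonical) isomorphism between the finite group and
its character group.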


THEOREM 13.⁹ If a radical extension $K$ is normal and separable over $F$, then
each character $C$ of its associated divisor class group $(R)/(F)$ yields an
automorphism $u \leftrightarrow u^C$ of $K$ over $F(\mathfrak{K})$, determined by the formulas

(13) $r^C = C(r) \cdot r$, for each $r \in R$.

This correspondence of characters to automorphisms is an isomorphism of the
character group $X$ to the whole Galois group of $K$ over the coefficient extension $F(\mathfrak{K})$.


Remark. In the case of a finite extension $K/F$, the conclusion of the theorem
means that the Galois group of $K/F(\mathfrak{K})$ is isomorphic to the divisor group,
regarded as (isomorphic to) its own character group; for the infinite case the
character group is more useful.

Proof. Let $S$ be a fixed automorphism of $K/F(\mathfrak{K})$. According to (3) the
function $a(S, r)$ is a homomorphism of $R$ to the multiplicative group $\mathfrak{K}^*$.
By (4), any radical $r$ of order 1 is mapped onto 1, for in the present case the
induced automorphism $\sigma$ of $\mathfrak{K}$ is the identity. Therefore the function can be
interpreted as a character $C_S(r) = a(S, r)$. According to (2), with $\sigma = 1$, the


9 This theorem, in a form using characters, was first proved by Baer. His statement is
given by the equivalence of the properties (1) and (3) of his Theorem 7.1, in [3].










product of automorphisms corresponds to the product of characters. Since the
whole field $K$ is generated by radicals, distinct isomorphisms must have different
effects on at least one radical. We conclude that the correspondence $S \to C_S$
does map the automorphism group of $K/F(\mathfrak{K})$ isomorphically on part of the
character group $X$, and that the relation (13) does hold for these automorphisms.

It remains only to show that every character can be realized by such an
automorphism. If $K$ is finite, this is immediate, for the number of characters equals
the order of $(R)/(F^*)$, which in turn (Theorem 5) equals the degree of $K/F(\mathfrak{K})$.
If $K$ is infinite, it may be represented as the join of the set of all its subfields
$K_{\nu}$ which are finite over $F(\mathfrak{K})$. Each of these subfields is a radical extension
(Theorem 6), hence by Theorem 12 is normal over $F$. Now let $C$ be any
character of $R$; it clearly induces a character $C_{\nu}$ on the radical $R_{\nu}$ of $K_{\nu}$, for every $\nu$.
By the finite case, each such character $C_{\nu}$ gives one and only one automorphism
$S_{\nu}$ of $K_{\nu}/F(\mathfrak{K})$. If $K_{\nu} \subset K_{\mu}$, the uniqueness of the automorphism $S_{\nu}$ shows that
$S_{\mu}$ must be a prolongation of $S_{\nu}$. The family of automorphisms $S_{\nu}$ may
therefore be combined to give a single automorphism $S$ of $K$, which agrees with
each $S_{\nu}$ on the corresponding subfield $K_{\nu}$. The map $S \to C_S$ of this composite
automorphism $S$ is exactly the given character. This completes the proof of
the theorem.

COROLLARY. If $K$ is an infinite extension, the correspondence of characters to
automorphisms is continuous in both directions.

Proof. $X$ is the character group of a discrete Abelian group $(R)/(F)$ of divisor
classes. Its topology is usually defined by the following complete system of
neighborhoods of unity. Each finite set $R_0$ of radicals determines a neighborhood
$N(R_0)$ consisting of all those characters $C$ for which

$C(r_0) = 1$ whenever $r_0 \in R_0$.

Because every divisor class has finite order, one obtains an equivalent complete
system of neighborhoods of 1 if one restricts $R_0$ to be a subgroup of finite order
over $F^*\mathfrak{K}^*$. On the other hand, each subfield $K_0$ of finite degree over $F(\mathfrak{K})$
defines a neighborhood of unity in the Galois group, consisting of all those
automorphisms $S$ for which

$u^S = u$ whenever $u \in K_0$.

But we know that the finite groups $R_0$ of radicals correspond to the finite
subfields $K_0 = F(\mathfrak{K}, R_0)$; it follows that the correspondence $S \to C_S$ maps one
system of neighborhoods on the other, and hence is indeed bicontinuous.


9. Analysis of the Galois group. Any separable normal extension $K$ of a
function field $F$ decomposes naturally into two stages: from $F$ to $F(\mathfrak{K})$ to $K$.
If $H$ is the Galois group of $K/F(\mathfrak{K})$ and $\Gamma$ the Galois group of $\mathfrak{K}/\mathfrak{F}$, the whole
Galois group $G$ of $K/F$ is a group extension of $H$ by $\Gamma$, in the sense that $G$ has $H$
as a normal subgroup and $G/H = \Gamma$ as the corresponding factor group. Such










an extension may be described by choosing in $G$ for each automorphism $\sigma$ of $\Gamma$
a representative $U_{\sigma}$ which induces $\sigma$. Every element of $G$ then has the form
$SU_{\sigma}$, for some $S \in H$, $\sigma \in \Gamma$. If $H$ is Abelian, the multiplication of these
elements is determined by a table

(14) $U_{\sigma} T U_{\sigma}^{-1} = T^{\sigma}$ $(T \in H, \sigma \in \Gamma)$,
(15) $U_{\sigma} U_{\tau} = S_{\sigma,\tau} U_{\sigma\tau}$ (each $S_{\sigma,\tau} \in H$).

Here (14) indicates that transformation by each $\sigma$ determines an automorphism
$T \to T^{\sigma}$ of $H$; it is independent of the choice of $U$ and has $(T^{\tau})^{\sigma} = T^{\sigma\tau}$. In
(15) $S_{\sigma,\tau}$ is a factor set¹⁰ of $\Gamma$ in $H$. We wish to find the special conditions
holding for these automorphisms and factor sets in the case when $K$ is a radical
extension.

THEOREM 14. If $K$ is a normal radical extension of $F$, then each character $C(S)$
of the Galois group $H$ of $K/F(\mathfrak{K})$ into the field $\mathfrak{K}$ has the following properties
(I) $C(S^{\sigma}) = [C(S)]^{\sigma}$, for each $S \in H$ and $\sigma \in \Gamma$;
(II) $C(S_{\sigma,\tau}) \sim 1$ in $\mathfrak{K}$.¹¹

Proof. We already know that the function $a(S, r)$ of (1) for fixed $S$ in $H$
is a character of $(R)/(F)$, and that the various automorphisms $S$ in $H$ give all
such characters. Therefore, for fixed $r$, $C_r(S) = a(S, r)$ is a character of $H$
in $\mathfrak{K}$, and all characters have this form, by the duality between a discrete
Abelian group and its character group. Repeated applications of (2) now give

$a(U_{\sigma} S U_{\sigma}^{-1}, r) = a(U_{\sigma}, r)\, a(S, r)^{\sigma}\, a(U_{\sigma}^{-1}, r)^{\sigma}$
$= a(U_{\sigma} U_{\sigma}^{-1}, r)\, a(S, r)^{\sigma} = a(S, r)^{\sigma}$.

This proves (I). To obtain (II), apply (2) to both sides of the equation
$a(U_{\sigma} U_{\tau}, r) = a(S_{\sigma,\tau} U_{\sigma\tau}, r)$. The result is

$a(S_{\sigma,\tau}, r) = a(U_{\sigma}, r)\, a(U_{\tau}, r)^{\sigma} / a(U_{\sigma\tau}, r)$.

The factor set on the right is indeed similar to 1 in $\mathfrak{K}$.
The condition (II) may be put into a number of different but essentially
equivalent forms. If the Abelian group $H$ is finite, and has a basis of
automorphisms $S_1, \cdots, S_h$ of orders $m_1, \cdots, m_h$, then the group of characters is
generated by the characters $C_i$ with

$C_i(S_i) = \zeta_i$, $C_i(S_j) = 1$, $i \neq j$,

where $\zeta_i$ is a primitive $m_i$-th root of unity in $\mathfrak{K}$. The condition then has the form

(IIa) $C_i(S_{\sigma,\tau}) \sim 1$ in $\mathfrak{K}$, $i = 1, \cdots, h$.


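10 For the theory of factor sets, see, for example, [21], or [12].
11 $f_{\sigma,\tau} \sim 1$ means that the factor set $f_{\sigma,\tau}$ of elements in $\mathfrak{K}$ is similar to 1; i.e., that there
exist constants $a_{\sigma}$ in $\mathfrak{K}^*$ such that $f_{\sigma,\tau} = a_{\sigma} a_{\tau}^{\sigma} / a_{\sigma\tau}$.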










In terms of a set of generating radicals $r_1, \cdots, r_h$ this may also be stated as

(IIb) $a(S_{\sigma,\tau}, r_i) \sim 1$ in $\mathfrak{K}$, $i = 1, \cdots, h$.

If each automorphism is expressed in terms of the generators, the factor set
becomes

$S_{\sigma,\tau} = S_1^{f_{1,\sigma,\tau}} S_2^{f_{2,\sigma,\tau}} \cdots S_h^{f_{h,\sigma,\tau}}$,

where, for each $i = 1, \cdots, h$, $f_{i,\sigma,\tau}$ is a factor set of integers modulo $m_i$, relative
to suitable automorphisms induced by $\Gamma$ on these integers. The condition (II)
then takes the form

(IIc) $[C_i(S_i)]^{f_{i,\sigma,\tau}} \sim 1$ in $\mathfrak{K}$, $i = 1, \cdots, h$.


One may also observe that the conditions (I) and (II) do not depend
essentially upon the choice of the representatives $U_{\sigma}$ used in (14) and (15) to describe
the group $G$; the validity of (I) and (II) for any one set of such representatives
implies the validity for any other set.

The main result of our analysis is the conclusion that the conditions (I) and
(II) are not only necessary, but also sufficient for the realization of such a group
extension by a field generated by radicals. This result may be stated in full as
follows, for given $F$ and $\mathfrak{F}$.

THEOREM 15. Let a finite group $G$ be represented as in (14) and (15) as an
extension of its normal subgroup $H$ by the factor group $\Gamma = G/H$, where

(i) $\Gamma$ is the Galois group of a finite separable normal extension $\mathfrak{K}$ of $\mathfrak{F}$,

(ii) $H$ is an Abelian group with exponent $e$ prime to the characteristic of $\mathfrak{F}$,

(iii) $\mathfrak{K}$ contains all $e$-th roots of unity,

(iv) The group $X$ of characters of $H$ is isomorphic to a group $D/(F)$ of divisor
classes from $F$.

Then $G$ can be realized as the Galois group of a normal separable radical extension
$K/F$ with coefficient field $\mathfrak{K}$ if and only if each character $C$ of $H$ in $\mathfrak{K}$ satisfies the
conditions (I) and (II) of Theorem 14. When these conditions hold, a field $K$ may
be so constructed that $D$ is the group of all divisors which become principal in $K$.


Remark. One may state a companion theorem in which $\mathfrak{F}$ but not $F$ is given.
Since one may construct a field $F$ which realizes any specified group of
fractional divisor classes, say by making $F$ a field $\mathfrak{F}(x_1, \cdots, x_n)$ of rational
functions¹² of $n$ indeterminates, this theorem would differ from the present one only
in the omission of hypothesis (iv). However, for an arbitrary $F$, hypothesis (iv)
is necessary.

Proof. The conditions (I) and (II) are already known to be necessary; hence
we need only assume them valid and construct a corresponding field. Let the
divisors $d_1, \cdots, d_h$ of orders $m_1, \cdots, m_h$ be a basis of the finite Abelian group
$D/(F)$, so that $d_i^{m_i} = (x_i)$ for elements $x_i$ in $F$. Because of (iv), one may


12 Observe that we consider here the group of (abstract) fractional divisors, not the 
group of arithmetic divisors. 







interpret each divisor $d$ of $D$ as a character $d(S)$ of $H$ in $\mathfrak{K}$. According to
condition (II), there are elements $a_i(\sigma)$ in $\mathfrak{K}$ such that

(16) $d_i(S_{\sigma,\tau}) = a_i(\sigma)\, a_i(\tau)^{\sigma} / a_i(\sigma\tau)$.

Because $d_i$ is an element of order $m_i$ in $D/(F)$, the $m_i$-th power of this equation
gives

$1 = d_i^{m_i}(S_{\sigma,\tau}) = a_i(\sigma)^{m_i}\, a_i(\tau)^{\sigma m_i} / a_i(\sigma\tau)^{m_i}$.

This asserts that the function $f_i(\sigma) = [a_i(\sigma)]^{m_i}$ is a crossed character of the
Galois group $\Gamma$ in the normal extension $\mathfrak{K}/\mathfrak{F}$. Noether's principal genus theorem
applies to this case, and states that the crossed character is a unit character,
so that there exist constants $b_i$ in $\mathfrak{K}^*$ for which

(17) $a_i(\sigma)^{m_i} = b_i^{\sigma - 1}$ (all $\sigma$ in $\Gamma$).
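
The step from a crossed character to the form (17) can be made concrete in the
simplest case. The sketch below is an illustration under assumptions that the
text does not make: the coefficient fields are finite, with $q$ and $q^f$ elements,
so that the Galois group is cyclic and generated by the substitution $x \to x^q$;
nonzero elements are written as exponents of a fixed generator modulo
$n = q^f - 1$; and the function name `crossed_characters_and_coboundaries` is
hypothetical. In this model a crossed character is fixed by its value $c$ on the
generating substitution, subject to the norm of $c$ being 1, and an element
$b^{\sigma - 1}$ has exponent $(q - 1)e$ when $b$ has exponent $e$.

```python
def crossed_characters_and_coboundaries(q, f):
    """Exponent model of the field with q**f elements over the field with q.
    Returns (values c of norm one, values of the form b^(sigma - 1))."""
    n = q**f - 1
    g = n // (q - 1)  # norm exponent 1 + q + ... + q^(f-1)
    # c has norm one exactly when g * c is a multiple of n.
    norm_one = [c for c in range(n) if (c * g) % n == 0]
    # b^(sigma - 1) = b^q / b has exponent (q - 1) * e for b of exponent e.
    coboundaries = sorted({((q - 1) * e) % n for e in range(n)})
    return norm_one, coboundaries


for q, f in [(3, 2), (4, 2), (2, 3)]:
    norm_one, coboundaries = crossed_characters_and_coboundaries(q, f)
    # In each case the two lists coincide: every crossed character is "unit".
    print(q, f, norm_one == coboundaries)
```

This checks only the cyclic finite-field case; the theorem quoted in the proof
covers an arbitrary finite normal separable coefficient extension.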


These elements $b_i$ are used to construct the field, just as if they were the
constants of the analogous equations (4). Observe that each divisor $d_i$ of order $m_i$
has $d_i^{m_i} = (x_i)$ for some element $x_i$ in $F$. Construct $K$ by adjoining to $F(\mathfrak{K})$
roots $r_1, \cdots, r_h$ of the $h$ equations¹³

$t^{m_i} = b_i x_i$, $i = 1, \cdots, h$.

The field $K = F(\mathfrak{K}, r_1, \cdots, r_h)$ so constructed is a radical extension with the
given group $D$ as its associated group of divisors (Theorem 5, Corollary). The
choice of $b_i$ in (17) insures that it is normal (Theorem 12).

The Galois group $G'$ of $K/F$ is described as in Theorem 14. Each
automorphism $S$ of the given group $H$ may be interpreted as a character $d \to d(S)$
of the class group $D/(F)$, and hence as an automorphism $S'$ of $K/F(\mathfrak{K})$, with

(18) $r^{S'} = d(S)\, r$, $d = (r)$,

just as in Theorem 13. We may identify $S$ with $S'$ and therefore $H$ with the
Galois group of $K/F(\mathfrak{K})$. The factor group $\Gamma' = G'/H$ is the group of $\mathfrak{K}/\mathfrak{F}$,
so may be identified with the given finite group $\Gamma$.

Further identification depends on the automorphisms and the factor sets.
For this, we must first select in $G'$ appropriate representatives $U_{\sigma}'$ of the
automorphisms $\sigma$ of $\Gamma$. Because of the choice (17) of the $b_i$, the construction of
Lemma 5 gives such automorphisms $U_{\sigma}'$ with

(19) $r_i^{U_{\sigma}'} = a_i(\sigma)\, r_i$, $i = 1, \cdots, h$.

Computation with (18) and (19) gives

$r_i^{U_{\sigma} S} = [d_i(S)\, r_i]^{U_{\sigma}} = [d_i(S)]^{\sigma} a_i(\sigma)\, r_i$,
$r_i^{S^{\sigma} U_{\sigma}} = [a_i(\sigma)\, r_i]^{S^{\sigma}} = [d_i(S^{\sigma})]\, a_i(\sigma)\, r_i$.

13 This construction is not uniquely determined, because of the many possible choices for the elements $x_i$.








152 SAUNDERS MAC LANE AND O. F. G. SCHILLING 


These results are equal by condition (I), applied with $C = d_i$; hence

(20) $U'_\sigma S^{\sigma} = S\,U'_\sigma$ ($S$ in $H$, $\sigma$ in $\Gamma$).

A similar computation shows that

(21) $U'_\sigma U'_\tau = S_{\sigma,\tau}\,U'_{\sigma\tau}.$

The results (20) and (21) mean that the multiplication table for G’ is the same 
as the table (14) and (15) for G. Hence the group G’ of the radical extension 
is essentially the given group G. 


COROLLARY 1. If $D$ is a divisor group of finite exponent $e$ in $F$, with $e$ prime to the characteristic of $F$, then there is a separable Abelian extension of $F$ in which every divisor of $D$ becomes principal if and only if all $e$-th roots of unity lie in $F$. When this condition is satisfied, one can find a normal radical extension of $F$ which has no coefficient extension, and in which all divisors of $D$ become principal.

This corollary emphasizes the extent to which the theory of radical extensions 
is not a theory of Abelian extensions, although in the classical case of algebraic 
number fields the study of the principal ideal theorem centers around Abelian 
class fields. 

Proof. If $D$ becomes principal in a normal extension $K$, it already becomes principal in some radical extension $K' \subset K$. If there is some $e$-th root of unity not present in $F$, this root appears at least once as the value of a character $C(r)$ for $K'$. The equations (14) and (I) then show that the Galois group of $K'$ is not Abelian. Hence our condition is necessary.

Conversely, assume all $e$-th roots present, and apply Theorem 15 with $\mathfrak{K} = \mathfrak{F}$. The construction of this theorem does yield a radical extension in which $D$ becomes principal; in the present case the assumption $\mathfrak{K} = \mathfrak{F}$ insures that the Galois group is the group $X$ of characters, hence is Abelian.

COROLLARY 2. If $D$ is a finite divisor group of exponent $e$ in $F$, with $e$ prime to the characteristic, then there exists a finite normal extension $K/F$ in which every divisor of $D$ becomes principal, and in which the coefficient field $\mathfrak{K}$ of $K$ is obtained from $\mathfrak{F}$ by adjunction of all $e$-th roots of unity. Furthermore, the Galois group of $K/F$ is metabelian.


Proof. This is an immediate corollary of the preceding construction. 


10. Computations for crossed characters. Our next aim is to show that any normal extension $K$ with a group which satisfies the essential conditions (I) and (II) is necessarily a radical extension. To this end, we use the decomposition of the group $G$ into $H$ and $\Gamma = G/H$ to study the crossed characters introduced in §4.

Let $c_S = c(S)$ be a crossed character of $G$ in $\mathfrak{K}$. Any element of $G$ has the form $SU_\sigma$, for $S$ in $H$, $\sigma$ in $\Gamma$; while by the definition of a crossed character

(22) $c(SU_\sigma) = c(S)\,c(U_\sigma)$ ($S$ in $H$).







Hence $c$ is completely determined by its values in the subgroup $H$ and by the values of the auxiliary function $c_\sigma = c(U_\sigma)$. One shows at once that these functions satisfy the conditions

(23) $c(S)\,c(T) = c(ST)$ ($S$, $T$ in $H$),
(24) $[c(S)]^{\sigma} = c(S^{\sigma})$ ($S$ in $H$),
(25) $c_\sigma c_\tau^{\sigma} = c(S_{\sigma,\tau})\,c_{\sigma\tau}.$

Conversely, given functions $c(S)$ and $c_\sigma$, for $S$ in $H$, which satisfy these three conditions, one computes at once that the formula (22) does give a crossed character, according to definition. The crossed characters therefore correspond to the pairs of solutions $(c(S), c_\sigma)$ of (23)-(25). The unit characters correspond to solutions with $c(S) = 1$, $c_\sigma = b^{\sigma-1}$, for some $b$ in $\mathfrak{K}$.

Suppose now that $c(S)$ for $S$ in $H$ is a solution of the first two equations (23) and (24). If the final equation (25) has, for given $c(S_{\sigma,\tau})$, two solutions $c_\sigma$ and $c'_\sigma$, then their quotient $c_\sigma/c'_\sigma$ is itself a crossed character of $\Gamma$ in $\mathfrak{K}$. By the principal genus theorem, this quotient has the form $c_\sigma/c'_\sigma = b^{\sigma-1}$, so that $c_\sigma = b^{\sigma-1}c'_\sigma$. This means that two different solutions of (25) simply give associate crossed characters of $G$. Therefore classes of crossed characters of $G$ in $\mathfrak{K}$ correspond to those distinct solutions of (23) and (24) for which (25) can be solved. But the requirement that (25) have a solution is just that $c(S_{\sigma,\tau}) \sim 1$ in $\mathfrak{K}$. This is condition (II) of Theorem 14, while (24) is just condition (I) of that theorem. We thus have proved


THEOREM 16. If a finite separable normal extension $K$ of $F$ has Galois groups $G$ over $F$ and $H$ over $F(\mathfrak{K})$, then the group of classes of crossed characters of $G$ in $\mathfrak{K}$ is isomorphic to the group of all those characters of $H$ in $\mathfrak{K}$ which satisfy conditions (I) and (II) of Theorem 14.

In any event this shows that there can be but a finite number of classes of crossed characters. Suppose now that conditions (I) and (II) hold for all characters of $H$ in $\mathfrak{K}$. This means that there is a full complement of crossed characters, equal to the number of characters of $H$ in $\mathfrak{K}$. When $H$ is Abelian and $\mathfrak{K}$ has all requisite roots of unity, this means that the number of classes of crossed characters is just $[K:F(\mathfrak{K})]$. But Theorem 3 showed that the number of classes of crossed characters is $[(R):(F)]$. Therefore $[(R):(F)] = [K:F(\mathfrak{K})]$, which is enough to insure that $K$ is a radical extension. This proves

COROLLARY 1. A finite separable normal extension $K/F$ is a radical extension if and only if the following conditions all hold:

(i) The group $H$ of $K/F(\mathfrak{K})$ is Abelian, with exponent $e$ prime to the characteristic of $F$;

(ii) All $e$-th roots of unity are present in $\mathfrak{K}$;

(iii) The Galois group satisfies the conditions (I) and (II) of Theorem 14.


In other words, a radical extension can be identified by consulting simply 
properties of its group, of its coefficient extension, and of the portion of its group 
leaving invariant this coefficient extension. 










CHAPTER III 
ARITHMETIC OF SPECIAL RADICAL EXTENSIONS 


11. Finite coefficient fields. In this chapter we aim to show that the general 
determination of all possible Galois groups for radical extensions, as given in 
Chapter II, can be used to get explicit formulas for these groups in a variety of 
special cases. 

Let $\mathfrak{F}$ be a finite field of $q$ elements, algebraically closed in $F$, while $K$ is a finite separable radical extension of $F$. The coefficient field $\mathfrak{K}$ of $K$ then has $q^f$ elements, where $f = [\mathfrak{K}:\mathfrak{F}]$; it consists of 0 and of powers of a suitable single element $\xi$. $\mathfrak{K}/\mathfrak{F}$ is cyclic, with a generating automorphism $\lambda$ which maps any $\alpha$ in $\mathfrak{K}$ on $\alpha^q$. The generating equations for the radical extension $K = F(\mathfrak{K}, r_1, \cdots, r_h)$ may now be written as

(1) $r_i^{m_i} = \xi^{g_i} x_i, \quad i = 1, \cdots, h, \quad g_i$ an integer.

Application of the conditions of Theorem 12 shows that $K/F$ is normal if and only if

(2) $q^f - 1 \equiv (q - 1)g_i \equiv 0 \pmod{m_i}, \quad i = 1, \cdots, h.$


In the group $(R)/(F)$ of divisor classes associated with $K$ we let $r^*$ denote the class belonging to the radical $r$. The Galois group of a normal $K/F$ may be determined explicitly in terms of $(R)/(F)$, the constants $q$ and $f$, and the parameters $g_i$ of the generation (1), as follows:

THEOREM 17. $G$ is an extension of its Abelian normal subgroup $H = (R)/(F)$ by a cyclic group of order $f$. For a suitable representative $U$ of the generator of this cyclic factor group, $G$ is determined by the formulas

(3) $U^{-1} r^* U = (r^*)^q,$
(4) $U^f = (r_1^*)^{g_1} \cdots (r_h^*)^{g_h}.$


Proof. First, the subgroup $H = (R)/(F)$ is generated by automorphisms corresponding to the characters $C_i$ with

(5) $C_i(r_i) = \xi^{t_i}, \quad C_i(r_j) = 1, \quad i \neq j.$

Here $t_i = (q^f - 1)/m_i$ is an integer, by (2), so that $\xi^{t_i}$ is a primitive $m_i$-th root of unity. In the second place, the automorphism $\alpha \to \alpha^q$ of $\mathfrak{K}/\mathfrak{F}$ may be extended by Lemma 5 to an automorphism $U$ for which

$r_i^U = \xi^{s_i} r_i, \quad s_i = (q - 1)g_i/m_i,$

where $s_i$ is an integer by (2). One then computes that $U^f$ multiplies $r_i$ by $\xi^{s_i(1 + q + \cdots + q^{f-1})} = \xi^{t_i g_i}$, so that the automorphism $U^f$ of $H$ corresponds in effect to the character $C = C_1^{g_1} \cdots C_h^{g_h}$, which does map $r_i$ on $\xi^{t_i g_i}$. Replacement of the characters by the isomorphic classes of divisors $r^*$ gives the equation (4), while (3) is immediate.
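
The arithmetic behind Theorem 17 can be checked numerically for a single radical. The following is a minimal sketch and not part of the original argument; the function names and the sample values are hypothetical. It tests only the congruences (2) and the identity $s_i(1 + q + \cdots + q^{f-1}) = t_i g_i$ that underlies (4).

```python
def normality_exponents(q, f, m, g):
    """Return (t, s) for one radical r with r^m = xi^g * x, or None if (2) fails.

    t = (q^f - 1)/m : xi^t is a primitive m-th root of unity, as in (5).
    s = (q - 1)g/m  : U multiplies r by xi^s.
    """
    if (q**f - 1) % m != 0 or ((q - 1) * g) % m != 0:
        return None  # condition (2) is violated, so K/F is not normal
    return (q**f - 1) // m, ((q - 1) * g) // m


def exponent_of_U_to_the_f(q, f, s):
    """Exponent e such that U^f multiplies r by xi^e.

    Each application of U raises constants to the q-th power and contributes
    a further factor xi^s, which gives s * (1 + q + ... + q^(f-1)).
    """
    return s * sum(q**j for j in range(f))


# Hypothetical sample values: q = 5, f = 2, m = 8, g = 2.
t, s = normality_exponents(5, 2, 8, 2)
assert exponent_of_U_to_the_f(5, 2, s) == t * 2
```

With these sample values the exponent of $U^f$ agrees with $t g$, which is the exponent attached to the character $C^{g}$ in the proof.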

This result is of interest because of the close parallel to the groups arising for $p$-adic fields. In the special case when the group $(R)/(F)$ of divisor classes is cyclic, the Galois group $G$ described by (3) and (4) is exactly the Galois group of a normal extension of a $p$-adic field with a finite residue class field $\mathfrak{F}$ and with ramification order prime to $p$. The structure of these latter groups is given in Albert [1], Theorem 9.


12. Universal groups over finite fields. Instead of describing the possible 
Galois groups of a radical extension directly, as we have done, one may also try 
to describe them as the possible homomorphic images of a “universal” group 
which is the group of some universal (infinite) radical extension. In this descrip- 
tion the universal group can even be replaced by a subgroup everywhere dense 
in the whole group, as was the case in Schilling’s investigations of certain regular 
extensions of complete fields [14]. This construction will be carried out below 
for the case of a finite coefficient field. 

Let $F$ be a function field over its algebraically closed finite subfield $\mathfrak{F}$, of $q$ elements, while $D$ is a divisor group of $F$, of order prime to the characteristic, and with a basis $d_1, \cdots, d_h$ for $D/(F)$. Let $d_i$ have order $m_i$, and choose $x_i$ in $F$ so that

(6) $d_i^{m_i} = (x_i), \quad i = 1, \cdots, h.$

Embed $\mathfrak{F}$ in its algebraically complete algebraic extension $\mathfrak{W}_\infty$, and let $W_\infty = F(\mathfrak{W}_\infty)$ be the corresponding coefficient extension of $F$. The group $D$ can be interpreted as a group of divisors in $W_\infty$, and Theorem 7 asserts that there is one and only one radical extension of $W_\infty$ with divisor group $D$. This field $K_\infty$ could also be described as the unique radical extension of $F$ with divisor group $D$ and with coefficient field $\mathfrak{W}_\infty$; it may be generated, as $K_\infty = F(\mathfrak{W}_\infty, s_1, \cdots, s_h)$, by special radicals $s_i$ with

(7) $s_1^{m_1} = x_1, \ \cdots, \ s_h^{m_h} = x_h.$

This field $K_\infty$ satisfies the conditions for normality over $F$ (Theorem 12). Any radical extension of $F$ with the same associated divisor group $D$ as $K_\infty$ is, by Theorem 7, a subfield of $K_\infty$; indeed, $K_\infty$ might be defined as the composite of all radical extensions of $F$ having this $D$. This field $K_\infty$, described in any one of these fashions, will be our universal field, for given $F$ and $D$.

To determine the Galois group of $K_\infty$, we first consider the group of its coefficient extension $W_\infty$. This involves the $n!$-adic integers (van Dantzig [5]), obtained by imposing on the ring of ordinary integers the topology in which a complete system of neighborhoods of zero is given by the principal ideals $(2!), (3!), \cdots, (n!), \cdots$. The integers, completed with respect to this topology, yield the $n!$-adic integers; each such integer $a$ is then a limit of a sequence of ordinary integers $a_i$, so that for any given integer $f$, $a_i$ will have a constant remainder modulo $f$ for $i$ sufficiently large. For any constant $\alpha$ in the field $\mathfrak{W}_\infty$ (with $\alpha^{q^f} = \alpha$ for some $f$), the sequence $\alpha^{e_i}$, with $e_i = q^{a_i}$, will then ultimately be a fixed element of $\mathfrak{W}_\infty$, which we write as

$\alpha^{q^a} = \lim_{i \to \infty} \alpha^{q^{a_i}} \quad (a = \lim a_i).$

The correspondence $\alpha \to \alpha^{q^a}$ is an automorphism of $\mathfrak{W}_\infty$, which may be regarded as the symbolic power $\sigma^a$ of the automorphism $\alpha \to \alpha^{\sigma} = \alpha^q$. These automorphisms

(8) $\alpha \to \alpha^{\sigma^a} = \alpha^{q^a}, \quad \alpha \in \mathfrak{W}_\infty, \quad a$ an $n!$-adic integer,

constitute the whole Galois group of $\mathfrak{W}_\infty/\mathfrak{F}$, so that this group is an "ideal cyclic group". They are multiplied by the rule $\sigma^a\sigma^b = \sigma^{a+b}$, and the topology in the Galois group is exactly the topology given for the exponents $a$, as one may see by approximating $\mathfrak{W}_\infty$ by the sequence of finite subfields $\mathfrak{F}_n$ of degree $n!$ over $\mathfrak{F}$. The Galois group of $K_\infty$ may now be described, as follows, in terms of the generating elements of (7).

THEOREM 18. The extension $K_\infty/F$ has for each character $C$ of its divisor class group $D/(F)$ an automorphism $u \to u^C$ described by the specifications that

(9) $s_i^C = C[(s_i)]\,s_i, \quad \alpha^C = \alpha \quad (\alpha \in \mathfrak{W}_\infty).$

For each $n!$-adic integer $a$ there is also an automorphism $U^a$ of $K_\infty/F$ with

(10) $s_i^{U^a} = s_i, \quad \alpha^{U^a} = \alpha^{q^a} \quad (\alpha \in \mathfrak{W}_\infty).$

The Galois group $G_\infty$ of $K_\infty/F$ consists of all the products $CU^a$, multiplied by the table

(11) $U^a U^b = U^{a+b}, \quad U^{-a} C U^{a} = C^{q^a},$

where the $C$'s are multiplied like characters. The topology in $G_\infty$ is determined by taking for each integer $n$ a neighborhood of the identity consisting of all $S = CU^a$ with $C = I$ and $a \equiv 0 \pmod{n!}$.

Proof. Since the field $K_\infty$ may be regarded as the composite of the two subfields $F(\mathfrak{W}_\infty)$ and $F(s_1, \cdots, s_h)$, the formulas (9) and (10) are sufficient to determine these automorphisms $C$ and $U^a$ completely. Each automorphism $\sigma^a$ of $\mathfrak{W}_\infty/\mathfrak{F}$ has a unique extension $U^a$ to an automorphism of $K_\infty/F$ which will leave fixed the elements $s_1, \cdots, s_h$ which generate $K_\infty$ over $F(\mathfrak{W}_\infty)$; this extension $U^a$ has the properties of (10). By Theorem 13 the group of $K_\infty$ over $F(\mathfrak{W}_\infty)$ is essentially the group $H$ of all characters $C$, as in (9). Every automorphism does have the form $CU^a$, since the $U^a$ exhaust the automorphisms of $F(\mathfrak{W}_\infty)/F$, and the formulas (11) follow from the corresponding formulas for finite extensions. Finally, the neighborhoods of the identity in $G_\infty$ may be determined as the sets of automorphisms which leave fixed the respective finite subfields generated by $s_1, \cdots, s_h$ and the subfields $\mathfrak{F}_n$ of $\mathfrak{W}_\infty$. This gives the topology, as described in the theorem.
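
The role of the $n!$-adic exponents can be illustrated with ordinary integers. The sketch below is an illustration under stated assumptions and not part of the proof: it takes the hypothetical sequence $a_i = 1! + 2! + \cdots + i!$, which converges in the $n!$-topology, and shows that its remainders modulo a fixed $f$ are eventually constant. Since the automorphism $\alpha \to \alpha^{q^{a_i}}$ of the subfield of $q^f$ elements depends only on $a_i$ modulo $f$, this is what makes the limit automorphism well defined on each finite subfield.

```python
from math import factorial


def partial_factorial_sums(count):
    """Return [a_1, ..., a_count] with a_i = 1! + 2! + ... + i!."""
    sums, total = [], 0
    for j in range(1, count + 1):
        total += factorial(j)
        sums.append(total)
    return sums


def remainders(f, count=10):
    """Remainders of a_1, ..., a_count modulo f."""
    return [a % f for a in partial_factorial_sums(count)]


# For j >= f the term j! is divisible by f, so the remainder is constant
# from a_(f-1) onward (it may stabilize earlier, as it does for f = 6).
print(remainders(6))
```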










Consider the subgroup $G_0$ of $G_\infty$ which is generated by the subgroup $H$ of characters and the automorphism $U = U^1$. This subgroup $G_0$ then consists of the distinct automorphisms $CU^k$, for $C$ any character of $D/(F)$, and $k$ any integer. The multiplication table is simply

(12) $U^{-1} C U = C^q$ ($C$ in $H$),

so $G_0$ is a cyclic extension of the character group $H$ of $D$.


THEOREM 19. The Galois group $G$ of any finite separable normal radical extension $K$ with the divisor group $D$ is a homomorphic image of the group $G_0$ of (12).

Proof. Since $K \subset K_\infty$, by the universality of $K_\infty$, the group $G$ is certainly a homomorph of $G_\infty$. Hence we need only show that every automorphism of $G$ can be induced by an automorphism of the subgroup $G_0 \subset G_\infty$. This may be proved, either by appeal to the known structure of $G$ in terms of characters (Chapter II) or by observing that $G_0$ is everywhere dense in the topology of $G_\infty$. Since a subgroup is everywhere dense in an infinite Galois group if and only if the subgroup induces all automorphisms of every finite extension (Schilling [14], Lemma 5), the desired result follows. It means that $G_0$ is a "universal" Galois group for finite radical extensions with the group $D$.

It is possible to construct various other universal groups. For example, one may consider only those radical extensions in which the coefficient extension $\mathfrak{K}$ is generated over $\mathfrak{F}$ by $e$-th roots of unity, where $e$ is the exponent of $D$. For a fixed finite group $D$ these fields are all contained in a universal field which has a finite universal group.


13. Restricted radical extensions. The question as to the variety of possible Galois groups for a radical extension may be formulated more sharply if one restricts attention to those extensions in which the attendant coefficient extension is as small as possible. If $D$ is a given finite group of divisors, with exponent $e$, from a field $F$, then every normal radical extension in which the divisors of $D$ become principal necessarily contains the field $\mathfrak{K}_0 = \mathfrak{F}(\zeta)$ generated by a primitive $e$-th root of unity $\zeta$. We say that the radical extension $K$ is restricted if it is normal and separable and if its coefficient extension $\mathfrak{K}$ is exactly this field $\mathfrak{K}_0$. For given $D$, the Galois group of such a restricted extension $K$ is obtained from two known components:

First, the group $H$, which is essentially the group of all linear characters $C$ of $D/(F)$ in $\mathfrak{K}_0$.

Second, the group $\Gamma$ of $\mathfrak{K}/\mathfrak{F}$. If $\mathfrak{L}$ is the intersection of $\mathfrak{F}$ and the subfield of $\mathfrak{K}_0$ generated by $\zeta$, $\Gamma$ may also be described as the Galois group of $\mathfrak{L}(\zeta)/\mathfrak{L}$. $\mathfrak{L}$ itself is a finite field or a field of algebraic numbers, so $\Gamma$ is known, as it is a subgroup of the appropriate cyclotomic group.

Our problem is the determination of those extensions $G$ of $H$ by $\Gamma$ which can be realized as Galois groups of restricted radical extensions. This problem can be treated in two parts; first, what are the different classes of (similar) factor sets for such extensions; second, which of these factor sets satisfy the requisite condition (II) of Theorem 14? The first part is purely group-theoretic, and the second is essentially a question in the splitting of a linear algebra determined by the factor set. In the case when $\mathfrak{F}$ has a finite characteristic, the Wedderburn theorem that there are no proper division algebras over a finite field insures that the factor sets are always similar to 1 in $\mathfrak{L}(\zeta)$ and hence in $\mathfrak{K}_0$, so that the second part of the problem is vacuous in this case.

We shall consider in particular conditions under which the Galois group is uniquely determined by the group $D$. When this is the case, the Galois group is that extension of $H$ by $\Gamma$ in which the factor set is identically 1. By Theorem 14, this particular group extension can always be realized; for example, as the group of the restricted extension $K = F(\mathfrak{K}_0, s_1, \cdots, s_h)$ generated by the special radicals $s_i$ of (7).


14. Group construction when the exponent is an odd prime power. 

THEOREM 20. If the group $D/(F)$ of divisor classes is cyclic of order $p^n$, where $p$ is an odd prime, then the Galois group $G$ of a restricted radical extension with divisor group $D$ is uniquely determined by the coefficient field $\mathfrak{F}$ and the order $e = p^n$ of $D$.

Proof. Let $Z$ be the multiplicative group generated by $\zeta$, a primitive $p^n$-th root of unity. Every automorphism of $\Gamma$ is then an automorphism of the group $Z$; since the latter automorphisms constitute the cyclic group of residues mod $p^n$ which are prime to the odd prime $p$, the group $\Gamma$ must be cyclic. Let a generator of $\Gamma$ be the automorphism $\sigma$ with

(13) $\zeta^{\sigma} = \zeta^{k}, \quad k$ an integer.

First consider the order of the automorphism $\sigma$ of (13). By cyclotomic theory, its order must be a divisor of $\varphi(p^n) = (p - 1)p^{n-1}$. Suppose in particular that the integer $k$ of (13) has $k \equiv 1 \pmod p$. Write

(14) $k = 1 + up^s, \quad u \not\equiv 0 \pmod p, \quad s \geq 1.$

Because $p$ is odd (or, for $p = 2$ and $s \geq 2$) the binomial theorem will give

$k^p = 1 + vp^{s+1}, \quad v \not\equiv 0 \pmod p.$

Repeated applications of this, for an exponent $m \not\equiv 0 \pmod p$, prove

(15) $k^{p^t m} = 1 + wp^{s+t}, \quad w \not\equiv 0 \pmod p.$

The order of the automorphism $\sigma$ of (13) is the least power $p^t m > 0$ for which $k^{p^t m} \equiv 1 \pmod{p^n}$. We thus conclude that if $k \equiv 1 \pmod{p^s}$ and $k \not\equiv 1 \pmod{p^{s+1}}$, the order of the automorphism $\sigma$ is $p^{n-s}$.

The group $Z$ of $p^n$-th roots of unity may be regarded as the group of characters of the cyclic group $D/(F)$ of divisor classes. By our general theory, the Galois group $G$ of a restricted radical extension is thus a group extension of the cyclic group $Z$ by the cyclic group $\Gamma$ of order $g$. Since $\Gamma$ is cyclic, $G$ may be represented as the set of all elements $\zeta^i U^j$, $0 \leq i < p^n$, $0 \leq j < g$, where $U$ is a representative of the generating element $\sigma$ of $\Gamma$, with a multiplication table

(16) $U^{-1}\zeta U = \zeta^{k}, \quad U^{g} = c.$

Here $g$ is the order of $\Gamma$, and $c$ is a constant of $Z$ which satisfies the condition

(17) $c^{\sigma} = c.$

By changing the choice of the representative $U$ of $\sigma$, $c$ may be changed to

(18) $c' = c\,b^{1+\sigma+\cdots+\sigma^{g-1}} = c\,(Nb),$

where $b$ may be any element in $Z$. We propose to show that $G$ is uniquely determined as the extension with factor set 1; that is, with $c = 1$.

Suppose first that in (13) $k \not\equiv 1 \pmod p$, and that $c = \zeta^{y}$. The condition (17) then gives $\zeta^{ky-y} = \zeta^{y(k-1)} = 1$, hence $y(k - 1) \equiv 0 \pmod{p^n}$. Since $k - 1 \not\equiv 0 \pmod p$, the only solution for $y$ is $y \equiv 0 \pmod{p^n}$. Thus $c = \zeta^{y} = 1$, as asserted.

Suppose next that $k \equiv 1 \pmod{p^s}$, but $k \not\equiv 1 \pmod{p^{s+1}}$. Then the solutions $c = \zeta^{y}$ of the condition (17) have $y \equiv 0 \pmod{p^{n-s}}$. On the other hand, in (18) the norm $Nb$ of $b = \zeta^{x}$ is $\zeta$ with an exponent $x(k^{g} - 1)/(k - 1)$. Since the order $g$ of $\sigma$ in this case is $p^{n-s}$, the exponent of $\zeta$ in this formula may by (14) and (15) be rewritten as $xwp^{s+(n-s)}/(up^{s}) = xp^{n-s}w/u \equiv 0 \pmod{p^{n-s}}$. Any $c = \zeta^{y}$ may thus be regarded as a norm $Nb$. Therefore in (18) the constant $c'$ may be reduced to 1.
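
The two computations in this proof, the order of $\sigma$ and the fact that every invariant constant is a norm, can be checked by brute force for small cases. The following sketch is illustrative only; the function names and the sample values $p = 3$, $n = 3$, $k = 4$ are assumptions made for the example. It works with exponents modulo $p^n$, so that $\zeta^{y}$ is represented by $y$.

```python
def mult_order(k, modulus):
    """Multiplicative order of k modulo modulus (k must be prime to modulus)."""
    order, power = 1, k % modulus
    while power != 1:
        power = power * k % modulus
        order += 1
    return order


def invariants_and_norms(k, modulus):
    """Exponents y of invariant elements and of norms, for sigma: zeta -> zeta^k.

    zeta^y is invariant, as in (17), when y*(k - 1) = 0 (mod modulus).
    N(zeta^x) = zeta^(x*(1 + k + ... + k^(g-1))), as in (18), g the order of sigma.
    """
    g = mult_order(k, modulus)
    invariant = {y for y in range(modulus) if y * (k - 1) % modulus == 0}
    norm_factor = sum(pow(k, j, modulus) for j in range(g)) % modulus
    norms = {x * norm_factor % modulus for x in range(modulus)}
    return g, invariant, norms


# k = 4 = 1 + 3 has s = 1 for p = 3, so sigma should have order 3^(3-1) = 9.
g, invariant, norms = invariants_and_norms(4, 27)
assert g == 9 and invariant == norms
```

In this sample the invariant exponents and the norm exponents are both the multiples of $p^{n-s} = 9$, in agreement with the reduction of $c'$ to 1.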


COROLLARY. For a restricted radical extension with a divisor group $D$ of odd prime power order $p^n$, the condition (II) of Theorem 15 for the realization of a Galois group is always satisfied.

Proof. In this case the group $D/(F)$ is no longer cyclic, but the Galois group $\Gamma$ of the coefficient field is still cyclic, just as above. The factor set for the Galois group $G$ then can again be put in a form $U^{g} = S$, as in (16). The condition (II) for the realization of $G$ then requires that each character $C(S)$ of $S$ be a norm. But the value of this character $C(S)$ is a root of unity $c = \zeta^{y}$, which by the condition (I) must satisfy the analogue of (17). The computation above then shows that $C(S)$ is indeed a norm.

In this non-cyclic case, all potential groups can be realized, but these groups are no longer all identical, as in Theorem 20. We show this by an example. For an odd prime $p$ first construct as coefficient field $\mathfrak{F}$ the field of characteristic $\infty$ generated by all $p$-th roots of unity. If $\zeta$ is a primitive $p^2$-th root of unity, the degree of $\mathfrak{F}(\zeta)/\mathfrak{F}$ is then $p$. Let $F = \mathfrak{F}(x)$ be the field of rational functions in one indeterminate over $\mathfrak{F}$; in it one may construct a fractional divisor group $D$ for which $D/(F)$ is Abelian of type $(p^2, p)$. $D$ has a basis of divisors $d_1$ and $d_2$ of orders $p^2$ and $p$, respectively. Its character group $H$ is generated by characters $C_1$ and $C_2$ with

$C_1(d_1) = \zeta, \quad C_1(d_2) = 1; \quad C_2(d_1) = 1, \quad C_2(d_2) = \zeta^{p}.$


14 A similar example may be constructed for prime characteristic. 










The potential Galois groups are then extensions of the group $H$ of these characters by the cyclic group $\Gamma$ generated by the automorphism $\sigma$ of (13), where in this case $k \equiv 1 \pmod p$. As in (16), any such extension is given by a multiplication table $U^{p} = C$. One computes readily that $C = C_2$ is an allowable value for this constant ((17) holds), but that $C_2$ is not the norm of another character; hence the extension with this constant $C_2$ does not have factor set 1.

We observe that the technique used in this example can be readily expanded to give a computation for the number of different possible Galois groups for given $D$ and $\mathfrak{F}$, in the case of a restricted radical extension.

One may show in other ways that the uniqueness assertion of Theorem 20 will not extend without restriction to the case when $D$ is not of prime power exponent. For example, if $\mathfrak{F}$ is a finite field of 25 elements, while $D$ is a cyclic group of order $9 \cdot 19 = 171$, one may show that $\mathfrak{F}(\zeta)/\mathfrak{F}$ has degree 9. In (18), the quantities $Nb$ are all 1, but in (17) one may use for $c$ any cube root of unity. In this case there are therefore three distinct extensions of $Z$ by $\Gamma$, all three of which may be realized as Galois groups of restricted radical extensions.
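
The numerical claims of this last example can be verified directly. The sketch below is an illustration under stated assumptions, not the authors' computation: it represents $\zeta^{y}$ by the exponent $y$ modulo 171, takes $\sigma$ to be the map $\zeta \to \zeta^{25}$, and counts invariant constants modulo norms. The function name is hypothetical.

```python
from math import gcd


def extension_count(k, modulus):
    """Return (g, number of invariant constants, number of norms, their quotient)."""
    # g = order of sigma: zeta -> zeta^k, i.e. the degree of the cyclotomic extension.
    g, power = 1, k % modulus
    while power != 1:
        power = power * k % modulus
        g += 1
    # y*(k - 1) = 0 (mod modulus) has gcd(k - 1, modulus) solutions y.
    invariant_count = gcd(k - 1, modulus)
    # Norm exponents are the multiples of 1 + k + ... + k^(g-1).
    norm_factor = sum(pow(k, j, modulus) for j in range(g)) % modulus
    norm_count = modulus // gcd(norm_factor, modulus)
    return g, invariant_count, norm_count, invariant_count // norm_count


print(extension_count(25, 171))
```

The four values returned are the degree 9, the three invariant constants (the cube roots of unity), the single norm 1, and the resulting count of three extensions.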


15. Group construction in the case $2^n$. The case of a cyclic divisor group of even prime power order must be given a separate treatment, in which the properties of quaternion algebras play an interesting role. It is desirable to treat separately the case when the base field has a prime characteristic.

THEOREM 21. Let $D/(F)$ be a cyclic group of divisor classes of order $2^n$, while $\mathfrak{F}$ is a coefficient field of characteristic $\infty$. Let $\mathfrak{L}$ be the intersection of $\mathfrak{F}$ with the field generated by the $2^n$-th roots of unity. If $\mathfrak{L}$ is not a real field, only one Galois group $G$ can be realized by a restricted radical extension with divisor group $D$, coefficient field $\mathfrak{F}$. If $\mathfrak{L}$ is real, and if $a^2 + b^2 = -1$ has no solution with elements $a$, $b$ in $\mathfrak{F}$, then again only one group $G$ can be realized. Finally, if $\mathfrak{L}$ is real, but $a^2 + b^2 = -1$ has a solution in $\mathfrak{F}$, then exactly two distinct groups $G$ can be realized.

Briefly speaking, this result shows that the Galois group in the prime power case is not always uniquely determined.

Proof. Let $\mathfrak{C}$ be the field generated (over the rational numbers) by the $2^n$-th roots of unity. The whole Galois group of $\mathfrak{C}$ is generated by two automorphisms, $\alpha$ and $\beta$, with

(19) $\zeta^{\alpha} = \zeta^{-1}, \quad \zeta^{\beta} = \zeta^{5}$ ($\zeta$ a primitive $2^n$-th root).

One has then

(20) $\alpha^2 = 1, \quad \beta^{2^{n-2}} = 1, \quad \alpha\beta = \beta\alpha.$

If $n \geq 3$, the whole group is Abelian of type $(2^{n-2}, 2)$; if $n = 2$, it is cyclic with generator $\alpha$; if $n = 1$, it consists of the identity alone.

First we shall consider the possible group extensions of the cyclic group $Z$ by the group $\Gamma$, since $\Gamma$ is a subgroup of the whole group generated by $\alpha$ and $\beta$. We consider various subcases.
















Case 1. $\Gamma$ cyclic, generated by $\alpha$. The factor set is then given as in (16), with $\sigma = \alpha$. The possible constants $c$, with $c^{\alpha} = c$, are just $c = \pm 1$. The norm $Nb$ of (18) is always 1, for by (19) $N\zeta = \zeta^{1+\alpha} = \zeta\zeta^{-1} = 1$. Hence in this case there are two different extensions, with $c = \pm 1$.

Case 2. $\Gamma$ cyclic, generated by $\sigma = \beta^{2^k}$. By (19), the exponent $\beta$ has the same effect as the exponent 5, and one may show that $5^{2^k} = 1 + u2^{k+2}$, where $u \not\equiv 0 \pmod 2$. It follows that the automorphism $\sigma$ generating $\Gamma$ has order $2^{n-k-2}$. A straightforward computation then proves that every invariant element $c$ as in (17) has $c = \zeta^{y}$ with $y \equiv 0 \pmod{2^{n-k-2}}$, and that every such element is a norm. We conclude that in this case there is only one group extension.
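
The congruence for $5^{2^k}$ and the resulting count of extensions in Cases 1 and 2 can be confirmed numerically. This is a hedged sketch with hypothetical helper names, working with exponents modulo $2^n$ for the sample value $n = 6$; it is a check of the stated facts, not the computation the proof refers to.

```python
def two_adic_valuation(x):
    """Largest e with 2^e dividing the nonzero integer x."""
    e = 0
    while x % 2 == 0:
        x //= 2
        e += 1
    return e


def classes(k, n):
    """Return (order of sigma, invariant constants modulo norms) for
    sigma: zeta -> zeta^k, with zeta a primitive 2^n-th root of unity."""
    modulus = 2**n
    g, power = 1, k % modulus
    while power != 1:
        power = power * k % modulus
        g += 1
    invariant = {y for y in range(modulus) if y * (k - 1) % modulus == 0}
    factor = sum(pow(k, j, modulus) for j in range(g)) % modulus
    norms = {x * factor % modulus for x in range(modulus)}
    return g, len(invariant) // len(norms)


n = 6
for k in range(n - 2):
    # 5^(2^k) - 1 is divisible by exactly 2^(k+2).
    assert two_adic_valuation(5 ** (2**k) - 1) == k + 2
    # Case 2: sigma = beta^(2^k) has order 2^(n-k-2) and gives one extension.
    assert classes(5 ** (2**k), n) == (2 ** (n - k - 2), 1)
# Case 1: sigma = alpha acts as zeta -> zeta^(-1) and gives two extensions.
assert classes(-1, n) == (2, 2)
```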

Case 3. $\Gamma$ cyclic, generated by $\sigma = \alpha\beta^{2^k}$. The order of $\Gamma$ is then again $2^{n-k-2}$ and the invariant elements are $\zeta^{y}$ with $y \equiv 0 \pmod{2^{n-1}}$. Each such element is a norm $N_\sigma b$, so there is but one group extension.

These three cases are the only ones in which the group $\Gamma$ is cyclic. The possible non-cyclic groups can all be treated together, as follows.

Case 4. $\Gamma$ not cyclic, generated by $\alpha$ and $\sigma = \beta^{2^k}$. In such a case the group extension $G$ of $Z$ by $\Gamma$ may be described by choosing representatives $U_\alpha$ and $U_\sigma$ for the two generators $\alpha$ and $\sigma$ of $\Gamma$; they will have a multiplication table

(21) $U_\alpha^{-1}\zeta U_\alpha = \zeta^{-1}, \quad U_\sigma^{-1}\zeta U_\sigma = \zeta^{5^{2^k}},$
(22) $U_\alpha^2 = c_\alpha, \quad U_\sigma^{s} = c_\sigma, \quad U_\alpha U_\sigma = a\,U_\sigma U_\alpha;$

where $s = 2^{n-k-2}$ is the order of $\sigma$. The constants $c_\alpha$, $c_\sigma$, $a$ of (22) must satisfy certain associativity conditions (see Zassenhaus [21], p. 97). One of these conditions is that $c_\alpha^{\alpha} = c_\alpha$; it implies as in Case 1 above that $c_\alpha = \pm 1$. The value of $c_\alpha$ cannot be altered by the choice of a new representative for $U_\alpha$. Another condition is $c_\sigma^{\sigma} = c_\sigma$, as in (17); by the technique of Case 2, this enables us to make $c_\sigma = 1$ by choosing a new representative for $U_\sigma$. After this reduction has been carried out, one of the associativity conditions for the third constant $a$ (obtained from the equation $U_\sigma^{s}U_\alpha = U_\alpha$) is

$1 = N_\sigma a = a^{1+\sigma+\sigma^2+\cdots+\sigma^{s-1}}.$

If $a = \zeta^{z}$, one may compute that $a$ satisfies this condition if and only if $z \equiv 0 \pmod{2^{k+2}}$. But the replacement of the representative $U_\alpha$ by $U'_\alpha = bU_\alpha$ changes the constant $a$ to $ab^{1-\sigma}$. By a suitable choice of the constant $b$, this result will be $ab^{1-\sigma} = 1$. After these reductions, the multiplication table takes on one of two forms

(23) $U_\alpha^2 = \pm 1, \quad U_\sigma^{s} = 1, \quad U_\alpha U_\sigma = U_\sigma U_\alpha.$

Again in this case there are exactly two possible group extensions.

The subdivision into cases may now be reformulated. There are two possible group extensions in all those instances when the group $\Gamma$ contains the automorphism $\alpha$; in the remaining cases there is but one extension. Now $\alpha$ is that automorphism which maps each root of unity into its complex conjugate, so the subfield of all elements invariant under $\alpha$ is just the maximal real subfield of the cyclotomic field. Since $\Gamma$ is the group of $\mathfrak{L}(\zeta)$ over $\mathfrak{L}$, where $\mathfrak{L}$ is the intersection of $\mathfrak{F}$ and the cyclotomic subfield, it follows that there will be two distinct group extensions if and only if $\mathfrak{L}$ is a real field. (Observe that if $\mathfrak{L}$ is real, all of its conjugates are real, so that this statement does not involve the possible ambiguity in the determination of $\mathfrak{L}$.)

In those cases when two group extensions are possible it remains to determine when the corresponding groups can both be realized. By the condition (II) of Theorem 15 this is just a question about algebras: when is the crossed product algebra $A = (\mathfrak{F}(\zeta), \Gamma, C(S_{\sigma,\tau}))$ a total matric algebra? Here $S_{\sigma,\tau}$ is the factor set for the group extension, $C(S_{\sigma,\tau})$ is the corresponding set of characters. In the Case 4 treated above, this factor set is determined by the constants $c_\alpha = \pm 1$, $c_\sigma = 1$, $a = 1$ of (22) and (23). Because of the simple form of these constants, the algebra may be reduced. The whole group $\Gamma$ is the direct product of the cyclic subgroup of order 2 generated by $\alpha$ and the cyclic subgroup $\Delta$ generated by $\sigma = \beta^{2^k}$. One may show that the factor set determined by (22) and (23) is "symmetric" relative to the subgroup $\Delta$; a general theorem on crossed product algebras (see, e.g., Schilling-Mac Lane [12]) then asserts that the whole algebra is similar to another algebra with the same factor set, restricted to the group $\Gamma/\Delta$. Now $\Gamma/\Delta$ is just the cyclic group of order 2 generated by $\alpha$. Hence the algebra under consideration is just the cyclic algebra

$A \sim (\mathfrak{F}(i), \alpha, \pm 1),$

with the multiplication table

$U_\alpha^{-1}\, i\, U_\alpha = -i, \quad U_\alpha^2 = \pm 1,$

where $i$ is a primitive 4-th root of unity. If the constant $U_\alpha^2$ here is $+1$, this is a total matric algebra; if it is $-1$, it is just the ordinary algebra of quaternions over the field $\mathfrak{F}$. The latter algebra will be a total matric algebra if and only if its multiplication constant $-1$ is a norm from $\mathfrak{F}(i)$:

$-1 = N(a + bi) = a^2 + b^2 \quad (a, b \in \mathfrak{F}).$

Therefore the group in question can be realized by a radical extension if and only if the equation $a^2 + b^2 = -1$ has a solution in $\mathfrak{F}$. In the remaining Case 1, the same condition may be found, without the necessity of a preliminary reduction of the algebra. We thus have all the results stated in Theorem 21.
Remark. In considering the condition that the algebra $A$ be a total matric algebra over the coefficient field $\mathfrak{F}$, one might be tempted to try to reduce this question to one about algebras over an algebraic number field, for in this case the invariants of an algebra are completely known. If $\mathfrak{F}_0$ is the field of all algebraic numbers contained in the given field $\mathfrak{F}$, the algebra $A = (\mathfrak{K}, \Gamma, C(S_{\sigma,\tau}))$ is a scalar extension of the algebra $A_0 = (\mathfrak{F}_0(\zeta), \Gamma, C(S_{\sigma,\tau}))$, and the base field $\mathfrak{F}_0$ of $A_0$ is algebraically closed in the extended base field $\mathfrak{F}$ of $A$. However, it would be false to suppose that $A$ is a total matric algebra if and only if $A_0$ is one.

This may be supported by an example. Let $\mathfrak{P}$ be the field of rational numbers, and, as in Witt ([19], p. 10), let $\mathfrak{F} = \mathfrak{P}(x, y)$ with $y^2 = -1 - x^2$ be a function field of one variable over $\mathfrak{P}$. One may show that $\mathfrak{P}$ is algebraically closed in $\mathfrak{F}$. However, the quaternion algebra considered above does not split over $\mathfrak{P}$, and does split over $\mathfrak{F}$, for $-1 = x^2 + y^2$ is a sum of two squares by the very definition of $\mathfrak{F}$.

To complete our investigation of the prime power case we state a result 
parallel to the previous theorem, but for fields of finite characteristic. In the 
statement, $\zeta$ is again a primitive $2^n$-th root of unity over the field $\mathfrak{F}$, while $\mathfrak{L}$ is
the intersection of $\mathfrak{F}$ and the field generated by $\zeta$ alone.


THEOREM 22. Let $D/(F)$ be a cyclic group of divisor classes of order $2^n$, while
$\mathfrak{F}$ is a field of finite characteristic $p$ prime to 2. The Galois group $G$ of a restricted
radical extension with divisor group $D$ is uniquely determined by $\mathfrak{F}$ and $2^n$, except
in the case when
$$n > 1, \qquad \mathfrak{L} = \mathfrak{P}, \qquad p \equiv -1 \pmod{2^n},$$
where $\mathfrak{P}$ is the prime field of characteristic $p$. In this exceptional case there are
exactly two groups possible.


The proof uses the same methods as the previous theorems, so will be omitted. 
Incidentally, the exceptional case can also be described as that case (with
$n > 1$) in which $x^2 = -1$ has no solution in $\mathfrak{F}$, while the field $\mathfrak{P}(\zeta)$ generated
by a primitive $2^n$-th root of unity contains a primitive $2^{n+1}$-th root of unity.


16. Function fields of one variable. From the classical point of view, the
most important special case of our theory is that in which $F$ is a function field
of one variable over a coefficient field $\mathfrak{F}$. In this case one gives special attention
to the ordinary or arithmetic divisors $A = P_1^{e_1} \cdots P_n^{e_n}$; as described in the
introduction, these are obtained from the prime divisors $P_i$ (or from the equivalent
valuations) of $F$. Those divisors $A$ which become principal in any finite
extension of $F$ necessarily have finite order (i.e., some $A^m$ is the divisor of a
function), as was proved in the introduction. But the divisors $A$ of finite order
constitute a group of arithmetic divisors which may be regarded as a subgroup
of the group of fractional divisors. One would then be led to consider
those radical extensions for which the associated group $D$ of divisors consists
exclusively of such arithmetic divisors. These extensions can be characterized
as follows.

THEOREM 23. A separable radical extension $K$ of a function field $F$ of one
variable is unramified if and only if its associated group $D$ of fractional divisors
consists entirely of arithmetic divisors.








164 SAUNDERS MAC LANE AND O. F. G. SCHILLING 


Proof. Recall that a prime divisor P of F is said to be unramified in an ex- 
tension K if P can be decomposed into a product of distinct prime divisors of K. 
The whole extension K is unramified if every prime divisor of F is unramified. 

First, suppose that $K$ is a radical extension in which a non-arithmetic fractional
divisor $d$ becomes principal. If $d$ has order $m$, with $d^m = (x)$, the arithmetic
divisor $(x) = P_1^{e_1} \cdots P_n^{e_n}$ then involves at least one prime divisor $P_i$ to an
exponent $e_i$ not divisible by $m$. Let $V_i$ be the valuation of $F$ corresponding to
the prime divisor $P_i$, so that for each function $y$ of the field, $V_i y$ is the exact
power of $P_i$ dividing $y$. The radical $r$ belonging to the given divisor $d$ has
$r^m = bx$, for $b$ a constant. In any extension $V_i$ of the valuation to $K$, $V_i r =
V_i(bx)/m = e_i/m$. Since this is a proper fraction, the valuation $V_i$ and hence
the corresponding prime ideal $P_i$ is ramified.

Conversely, suppose that $K$ is a radical extension belonging to a group $D$ of
arithmetic divisors. It is well known that the decomposition of divisors of $F$
can be found in terms of the corresponding decompositions of ideals in suitable
integrally closed rings. Specifically, let $z$ be any function of $F$ transcendental
over $\mathfrak{F}$, and let $\mathfrak{O}_z$ be the ring of all functions of $F$ integral over $\mathfrak{F}[z]$, while $\mathfrak{O}_{1/z}$
is the ring of all functions integral over $\mathfrak{F}[1/z]$. Then$^{15}$ each prime divisor $P$
of $F$ corresponds either to a prime ideal $\mathfrak{p}$ of $\mathfrak{O}_z$ or to a prime ideal $\mathfrak{p}$ of $\mathfrak{O}_{1/z}$
which is a factor of $1/z$. Furthermore, $P$ is ramified in an extension $K$ if and
only if the corresponding prime ideal $\mathfrak{p}$ is ramified in $K$ (that is, has a square of
some prime ideal as a factor). However, the ramified prime ideals from a ring $\mathfrak{O}$
are simply the prime ideal factors of the discriminant of the extension $\mathfrak{O}^*$ of $\mathfrak{O}$
($\mathfrak{O}^*$ is the integral closure of $\mathfrak{O}$ in $K$). This discriminant is in turn a factor of
the discriminant of any integral element of $K$ generating $K$ over $\mathfrak{O}$.

The radical extension $K$ under consideration may now be proved to be unramified
by examination of the discriminants of suitable generating elements.
For example, consider a separable coefficient extension $K = F(\mathfrak{K})$. This can
be generated by an element of $\mathfrak{K}$ which is integral with respect to either ring $\mathfrak{O}_z$
or $\mathfrak{O}_{1/z}$. The discriminant for this element is a constant of $\mathfrak{F}$; hence no prime
ideals are ramified in this case. On the other hand let $K = F(r)$ be generated$^{16}$
by a radical $r$ with an arithmetic divisor $(r) = A$. If $r^m = x$, for $x$ in $F$, the
discriminant of this generating equation, relative to $\mathfrak{O}_z$, will involve only prime
ideal factors of $x$. On the other hand, the same extension $F(r)$ may be obtained
by adjoining a different radical $r' = r/u$, where $u$ is a function of $F$ with a
divisor $(u) = AB$, $B$ relatively prime to $A$. The discriminant for this second
defining equation will no longer involve the prime ideal factors of $x$. All told,
one can conclude (with similar arguments for $\mathfrak{O}_{1/z}$) that no prime ideals are
ramified.

15 See, for example, the discussion in F. K. Schmidt [15], pp. 6-7, and p. 10. His statements,
which envisage the case of a finite coefficient field $\mathfrak{F}$, are also valid for an arbitrary
coefficient field. In case $\mathfrak{F}$ is imperfect, the requisite Noether ideal theory in $\mathfrak{O}$ may be
conveniently established by means of valuation theory, as in F. K. Schmidt [16], §3, 3.

16 The fact that such an extension is unramified can also be proved by arguments with
valuations, as in Deuring [6], Theorem 9 (for the case when the m-th roots of unity all lie
in F).

The general radical extension K with an arithmetic divisor group is obtained 
from F by a succession of adjunctions of the two types considered above; hence 
a combination of the above arguments proves the theorem. 

COROLLARY. Let K be a normal extension of F, with the same coefficient field
as F, in which a group D of arithmetic divisors of F become principal. Assume 
that F contains all e-th roots of unity, where e is the exponent of D (and is prime 
to the characteristic of F). Then every divisor of D will become principal in an 
unramified Abelian subfield of K which has over F a Galois group isomorphic 
to D/(F). 

Proof. The subfield in question is the field generated by all radicals of K 
with (r) in D. The subfield is unramified by the theorem above, while its 
Galois group is given by Theorem 13. 

Remark. If K is an unramified normal extension of F with the same coeffi- 
cient field as F, and if every root of unity appearing in the characters of the 
Galois group of K over F lies in the field F, then the maximal Abelian subfield 
of K can be described as the maximal subfield of K which is a radical exten- 
sion of F. 


17. Examples of divisor groups in the arithmetic case. To show that our 
theory of the Galois groups of radical extensions applies in full form to the case 
of extensions with an associated group of arithmetic divisors, we shall now show 
that one may realize groups of arithmetic divisors of arbitrary complexity in a 
suitably chosen function field. 


THEOREM 24. If $\mathfrak{F}$ is any coefficient field, $D^*$ any finite Abelian group, with
order prime to the characteristic of $\mathfrak{F}$, then there exists a function field $F$ of one
variable over $\mathfrak{F}$ which contains a group $D$ of arithmetic divisors such that the corresponding
group $D/(F)$ of divisor classes is isomorphic to the given group $D^*$,
provided only that $\mathfrak{F}$ contains a sufficiently large (finite) number of elements.

Proof. Since the given Abelian group $D^*$ can always be embedded in a direct
product of $t$ cyclic groups of order $n$, where $n$ is prime to the characteristic, it
will suffice to realize the latter group as a group of divisor classes. We shall
assume that the coefficient field $\mathfrak{F}$ has at least $n + t$ distinct elements
$a_1, a_2, \cdots, a_{n+t}$.

First construct a field $k = \mathfrak{F}(x)$ over $\mathfrak{F}$, and over this a function field $F =
k(y) = \mathfrak{F}(x, y)$, where $y^n = (x - a_1)(x - a_2) \cdots (x - a_s)$, while $s$ is an integer
between $t + 1$ and $t + n$, so chosen that $(s, n) = 1$. In $k$ let $(x - a_i) = p_i/p_\infty$,
where $i = 1, \cdots, s$. In the valuation $V_i$ corresponding to the prime divisor
$p_i$, one has $V_i y = (1/n)V_i(x - a_i)$; hence $p_i$ has the decomposition $p_i = P_i^n$
in $F$. Since $s$ is prime to $n$, one also has $p_\infty = P_\infty^n$. We propose then to
consider the divisors $A_i = P_i/P_\infty$, of order zero in $F$. These divisors become
principal in the extended field $K = F((x - a_1)^{1/n}, \cdots, (x - a_s)^{1/n})$.













We shall prove that each divisor $A_i$ has order $n$ and that they generate together
a group of divisor classes which is the direct product of $s - 1$ cyclic groups of
order $n$. For this purpose consider the extensions $k' = k(\zeta)$, $F' = F(\zeta)$,
$K' = K(\zeta)$, where $\zeta$ is a primitive $n$-th root of unity. We remark that the
$A_i$'s become divisors in $F'$ of exactly the same order as in $F$, according to
Corollary 2 of Theorem 3. Let $w$ be the group of all those elements of $k'$ which
have an $n$-th root in the extension $K'$. Since the $n$-th roots of unity all lie
in $k'$, the ordinary Kummer theory (Witt [20]) asserts that the degree of $K'$
over $k'$ is $[w : k'^{*n}]$. Since $w$ contains each $(x - a_i)$,
$$[K' : k'] \ge [k'^{*n}\{(x - a_1), \cdots, (x - a_s)\} : k'^{*n}].$$
By the ordinary decomposition theorem for the polynomials of $\mathfrak{F}'[x]$, the index
on the right is $n^s$. On the other hand, the generation of $K'$ by radicals shows
that $[K' : k'] \le n^s$. Therefore $[K' : k'] = n^s$. The same argument shows that
$[F' : k'] = n$; therefore $[K' : F'] = n^{s-1}$.

Over $F'$, the radical extension $K'$ can be generated by $s - 1$ of the radicals,
say by $(x - a_1)^{1/n}, \cdots, (x - a_{s-1})^{1/n}$. Furthermore the extension $K'$ involves
no coefficient extension over $k'$ or over $F'$, for, over any larger coefficient field
$\mathfrak{W} \supset \mathfrak{F}'$ one could again apply the above argument about degrees to show that
$[K'(\mathfrak{W}) : k'(\mathfrak{W})] = n^s$; this implies that no proper extension $\mathfrak{W}$ could be contained
in $K'$.

The group $D$ of (arithmetic) divisors which become principal in $K'$ is just the
group generated by the divisors $A_i$ of $(x - a_i)^{1/n}$, $i = 1, \cdots, s - 1$, according
to the Corollary of Theorem 5. By Theorem 5, this group $D$ has an index
$[D : (F')] = [K' : F'] = n^{s-1}$. In the original field $F$, the group of divisor classes
generated by $P_1/P_\infty, \cdots, P_{s-1}/P_\infty$ therefore has the required structure.

This theorem shows that the consideration of non-Abelian extensions is essential.
For suppose that the coefficient field $\mathfrak{F}$ of the theorem does not contain
all n-th roots of unity. Since there nevertheless may be divisor classes of 
order n, a normal extension of F in which such a divisor class becomes principal 
cannot be Abelian, by the first Corollary of Theorem 15. Any such normal 
extension will necessarily involve a coefficient extension. 

The case when the coefficient field $\mathfrak{F}$ is finite is especially interesting, in
view of the class field theory. For any given integer $n$, there will exist a prime
$p \ge n + t$ such that $p \not\equiv 1 \pmod{n}$. The finite field $\mathfrak{F}$ of $p$ elements will then
be one to which the construction of the theorem applies. By the existence
theorem of the class field theory (Witt [20]), there will exist unramified Abelian
extensions of $F$ with a degree divisible by $n$. In none of these extensions will
the divisor classes of order $n$ in the field become principal.


HARVARD UNIVERSITY AND THE UNIVERSITY OF CHICAGO. 


BIBLIOGRAPHY 


1. A. A. Albert, On p-adic fields and rational division algebras, Annals of Mathematics,
vol. 41(1940), pp. 674-693.
2. E. Artin, Idealklassen in Oberkörpern und allgemeines Reziprozitätsgesetz, Abhandlungen
aus dem Mathematischen Seminar, Hamburg, vol. 7(1930), pp. 46-51.
3. R. Baer, Abelian fields and duality of Abelian groups, American Journal of Mathematics,
vol. 59(1937), pp. 869-888.
4. A. H. Clifford and S. Mac Lane, Factor sets of a group in its abstract unit group,
Transactions of the American Mathematical Society, vol. 50(1941), pp. 385-406.
5. D. van Dantzig, Nombres universels ou $\nu!$-adiques avec une introduction sur l'algèbre
topologique, Annales de l'École normale, (3), vol. 53(1936), pp. 257-307.
6. Max Deuring, Zur arithmetischen Theorie der algebraischen Funktionen, Mathematische
Annalen, vol. 106(1932), pp. 77-102.
7. Ph. Furtwängler, Beweis des Hauptidealsatzes für die Klassenkörper algebraischer
Zahlkörper, Abhandlungen aus dem Mathematischen Seminar, Hamburg, vol.
7(1930), pp. 14-36.
8. H. Hasse, Theorie der relativ-zyklischen algebraischen Funktionenkörper, insbesondere
bei endlichem Konstantenkörper, Journal für Mathematik, vol. 172(1935), pp.
37-54.
9. S. Mac Lane, A lattice formulation for transcendence degrees and p-bases, this Journal,
vol. 4(1938), pp. 455-468.
10. S. Mac Lane, Modular fields. I, Separating transcendence bases, this Journal, vol.
5(1939), pp. 372-393.
11. S. Mac Lane and O. F. G. Schilling, Normal algebraic number fields, Transactions of
the American Mathematical Society, vol. 50(1941), pp. 295-384.
12. S. Mac Lane and O. F. G. Schilling, A formula for the direct product of cross product
algebras, Bulletin of the American Mathematical Society, vol. 48(1942), pp.
108-114.
13. E. Noether, Der Hauptgeschlechtssatz für relativ-galoissche Zahlkörper, Mathematische
Annalen, vol. 108(1933), pp. 411-419.
14. O. F. G. Schilling, Regular normal extensions over complete fields, Transactions of the
American Mathematical Society, vol. 47(1940), pp. 440-454.
15. F. K. Schmidt, Analytische Zahlentheorie in Körpern der Charakteristik p, Mathematische
Zeitschrift, vol. 33(1931), pp. 1-32.
16. F. K. Schmidt, Zur arithmetischen Theorie der algebraischen Funktionen, I, Mathematische
Zeitschrift, vol. 41(1936), pp. 415-438.
17. F. K. Schmidt, Die Theorie der Klassenkörper über einem Körper algebraischer Funktionen
in einer Unbestimmten und mit endlichem Koeffizientenbereich, Sitzungsberichte
der Physikalisch-Medizinischen Sozietät, Erlangen, vol. 62(1930), pp.
267-284.
18. A. Speiser, Zahlentheoretische Sätze aus der Gruppentheorie, Mathematische Zeitschrift,
vol. 5(1919), pp. 1-6.
19. E. Witt, Zerlegung reeller algebraischer Funktionen in Quadrate. Schiefkörper über
reellem Funktionenkörper, Journal für Mathematik, vol. 171(1934), pp. 4-11.
20. E. Witt, Der Existenzsatz für abelsche Funktionenkörper, Journal für Mathematik, vol.
173(1935), pp. 43-51.
21. H. Zassenhaus, Lehrbuch der Gruppentheorie, Hamburger Mathematische Einzelschriften,
vol. 21(1937).





ABSOLUTE NORLUND SUMMABILITY 
By LEONARD McFADDEN


1. Introduction. The method of summability considered here was first introduced
by Woronoi [11], but is more closely identified with the name of Nörlund [8].
Accordingly, we will use the term "Nörlund transformations" to indicate
the transformations in question.

Let $\{p_n\}$ be a sequence of constants, real or complex valued, and let $\{P_n\}$
denote the sequence of partial sums. We will call $\{t_n\}$ the Nörlund transform
of an arbitrary sequence $\{U_n\}$, where
$$t_n = \frac{1}{P_n} \sum_{\nu=0}^{n} p_{n-\nu} U_\nu, \tag{1.01}$$
it being assumed of course that $P_n \ne 0$. The sequence $\{U_n\}$ is said to be
summable by the Nörlund mean $N_p$ defined by $\{p_n\}$, or summable $N_p$, if
$\lim_{n \to \infty} t_n$ exists.


The conditions for regularity of such a transformation are
$$\lim_{n \to \infty} \frac{p_n}{P_n} = 0, \tag{1.02}$$
$$\sum_{k=0}^{n} |p_k| \le C\,|P_n|, \tag{1.03}$$
where $C$ is a finite positive constant. It is easily seen that (1.02) is equivalent to
$$\lim_{n \to \infty} \frac{P_{n-1}}{P_n} = 1. \tag{1.04}$$





We might also note that, if $p_n$ is real and non-negative, condition (1.03) is
satisfied automatically and, if in addition $p_n$ is non-increasing, condition (1.02)
is also satisfied.

The sequence $\{U_n\}$ will be said to be absolutely summable by the Nörlund
mean defined by the sequence $\{p_n\}$, or summable $|N_p|$, provided that
$$\sum_{n=1}^{\infty} |t_n - t_{n-1}| \le C < \infty. \tag{1.05}$$
If we let $N_p$ and $|N_p|$ denote, respectively, the class of sequences summable
$N_p$ and $|N_p|$, we have the following

(1.06) THEOREM. $|N_p| \subset N_p$; $N_p \not\subset |N_p|$.


Received November 3, 1941. 
1 Numbers in brackets refer to the bibliography at the end of the paper. 




Proof. To prove the first relation, we observe that, since $\{U_n\}$ is summable
$|N_p|$, then $\sum_{n=1}^{\infty} |t_n - t_{n-1}| = C < \infty$. Therefore, for given $\epsilon > 0$ there exists
a sufficiently large positive integer $N$ such that for any $q > 0$ we have, when
$n > N$, $\sum_{m=n}^{n+q} |t_m - t_{m-1}| < \epsilon$. It follows that
$$|t_{n+q} - t_n| = \Bigl|\sum_{m=n+1}^{n+q} (t_m - t_{m-1})\Bigr| \le \sum_{m=n+1}^{n+q} |t_m - t_{m-1}| < \epsilon.$$
Hence, $t_n$ converges to some limit $t$. This is, however,
the condition that $\{U_n\}$ be summable $N_p$.

To prove the second relation we exhibit the following example. Let $p_n = 1$,
$n = 0, 1, \cdots$, and let $U_n = (-1)^n$, $n = 0, 1, \cdots$. Then
$$t_k = \begin{cases} \dfrac{1}{k+1}, & \text{for } k \text{ even}, \\ 0, & \text{for } k \text{ odd}. \end{cases}$$
Clearly $t_k$ converges to zero, but
$$|t_k - t_{k-1}| = \begin{cases} \dfrac{1}{k+1}, & \text{for } k \text{ even}, \\ \dfrac{1}{k}, & \text{for } k \text{ odd}. \end{cases}$$
Hence, $\sum_{k=1}^{\infty} |t_k - t_{k-1}|$ diverges and our theorem is proved.
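The counterexample in the proof can be checked numerically. The following is a minimal sketch, assuming the transform is $t_n = P_n^{-1} \sum_{\nu=0}^{n} p_{n-\nu} U_\nu$; the function name `norlund_transform` and the truncation length `N` are illustrative choices, not part of the paper.

```python
from fractions import Fraction

def norlund_transform(p, U):
    """Return [t_0, ..., t_{N-1}] with t_n = (1/P_n) * sum_{v=0}^{n} p[n-v] * U[v]."""
    t, P = [], Fraction(0)
    for n in range(len(U)):
        P += p[n]                      # P_n = p_0 + ... + p_n, assumed non-zero
        s = sum(Fraction(p[n - v]) * U[v] for v in range(n + 1))
        t.append(s / P)
    return t

N = 200                                # illustrative truncation length
p = [1] * N                            # p_n = 1
U = [(-1) ** n for n in range(N)]      # U_n = (-1)^n
t = norlund_transform(p, U)

# Successive differences |t_k - t_{k-1}| and a few partial sums of them.
steps = [abs(t[k] - t[k - 1]) for k in range(1, N)]
partial = [float(sum(steps[:m])) for m in (10, 50, 199)]
print(t[:6])
print(partial)
```

The values $t_k$ shrink to zero while the partial sums of $|t_k - t_{k-1}|$ keep growing, roughly like a harmonic series, which is the behaviour the proof relies on.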


Since there is no loss of generality in considering $p_0 = 1$, we shall do so throughout
and, unless otherwise stated, it will be assumed that the transformations
considered are regular.


2. Inclusiveness relations. Kogbetliantz [6] proved that a series absolutely
summable by the Cesàro mean of order $\alpha$ is also absolutely summable by the
Cesàro mean of order $\beta > \alpha$. It is well known that the Cesàro transformation
is a particular case of the Nörlund transformation. Thus we are led to the
problem of determining what conditions must be imposed on the sequences
$\{p_n\}$ and $\{q_n\}$ in order that a sequence summable $|N_q|$ will also be summable
$|N_p|$.

We will first state the following result due to Florence Mears [7]. Given the
matrix $\|a_{nk}\|$, let $U_n = \sum_{k=0}^{n} u_k$, $U_n' = \sum_{k=0}^{\infty} a_{nk} U_k$, and $u_n' = U_n' - U_{n-1}'$.

(2.01) THEOREM. The necessary and sufficient conditions that $\sum_{n=0}^{\infty} |u_n'|$ converge
whenever $\sum_{n=0}^{\infty} |u_n|$ converges are

(1) $\sum_{k=0}^{\infty} a_{nk}$ converges for all $n$;

(2) $\sum_{n=0}^{\infty} \Bigl|\sum_{\nu=k}^{\infty} (a_{n\nu} - a_{n-1,\nu})\Bigr| \le C < \infty$ for all $k$.










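Now consider
$$t_n = \frac{1}{P_n} \sum_{\nu=0}^{n} p_{n-\nu} U_\nu \tag{2.02}$$
and
$$t_n' = \frac{1}{Q_n} \sum_{\nu=0}^{n} q_{n-\nu} U_\nu; \tag{2.03}$$
then
$$t_n = \frac{1}{P_n} \sum_{\nu=0}^{n} R_{n-\nu} Q_\nu t_\nu', \tag{2.04}$$
where $R_k$ is obtained from the following system of equations obtained by equating
the coefficients of $U_\nu$ in the numerators of the right hand sides of (2.02) and
(2.04):
$$\begin{aligned} p_0 &= q_0 R_0, \\ p_1 &= q_1 R_0 + q_0 R_1, \\ &\;\;\vdots \\ p_k &= q_k R_0 + \cdots + q_0 R_k. \end{aligned} \tag{2.05}$$
Adding, we get
$$P_k = Q_k R_0 + \cdots + Q_0 R_k. \tag{2.06}$$
If we write
$$p(x) = p_0 + p_1 x + \cdots + p_n x^n + \cdots, \tag{2.07}$$
$$q(x) = q_0 + q_1 x + \cdots + q_n x^n + \cdots, \tag{2.08}$$
$$R(x) = R_0 + R_1 x + \cdots + R_n x^n + \cdots, \tag{2.09}$$
it is easily seen that
$$\frac{p(x)}{q(x)} = R(x). \tag{2.10}$$
Wherever they appear, it will be understood that $p_{-n} = P_{-n} = R_{-n} = 0$.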
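The triangular system for the $R_k$ can be solved one coefficient at a time, which amounts to dividing the power series $p(x)$ by $q(x)$. The sketch below is only an illustration: the helper name `quotient_coefficients` and the sample sequences (the coefficients of $(1 - x)^{-2}$ and $(1 - x)^{-1}$) are assumptions chosen so that the answer is easy to predict, and the final check tests the summed relation $P_k = Q_k R_0 + \cdots + Q_0 R_k$.

```python
from fractions import Fraction
from itertools import accumulate

def quotient_coefficients(p, q):
    """Solve p_k = q_k R_0 + ... + q_0 R_k successively for R_0, R_1, ..."""
    R = []
    for k in range(len(p)):
        known = sum(q[k - j] * R[j] for j in range(k))
        R.append((Fraction(p[k]) - known) / q[0])   # requires q_0 != 0
    return R

n_terms = 8
p = [n + 1 for n in range(n_terms)]    # coefficients of (1 - x)^(-2)
q = [1] * n_terms                      # coefficients of (1 - x)^(-1)
R = quotient_coefficients(p, q)        # quotient series is (1 - x)^(-1)

# Partial sums P_k, Q_k and the summed relation P_k = sum_j Q_{k-j} R_j.
P = list(accumulate(p))
Q = list(accumulate(q))
relation_holds = all(
    P[k] == sum(Q[k - j] * R[j] for j in range(k + 1)) for k in range(n_terms)
)
print(R, relation_holds)
```

Exact rational arithmetic is used so that the identity can be tested with equality rather than a tolerance.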


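(2.11) THEOREM. A necessary and sufficient condition that $|N_q| \subset |N_p|$ is
$$\sum_{n=0}^{\infty} \Bigl|\sum_{\nu=k}^{n} \Bigl(\frac{R_{n-\nu} Q_\nu}{P_n} - \frac{R_{n-\nu-1} Q_\nu}{P_{n-1}}\Bigr)\Bigr| \le C < \infty \quad \text{for all } k.$$

Proof. The matrix $\|R_{n-\nu} Q_\nu / P_n\|$ satisfies the conditions of Theorem (2.01),
due to (2.06) and the hypothesis.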










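(2.12) THEOREM. Sufficient conditions that $|N_q| \subset |N_p|$ are

(1) $\sum_{\nu=k}^{n} \Bigl(\dfrac{R_{n-\nu} Q_\nu}{P_n} - \dfrac{R_{n-\nu-1} Q_\nu}{P_{n-1}}\Bigr) \ge 0$, $k = 0, 1, \cdots, n$;

(2) $\sum_{\nu=0}^{n} \Bigl|\dfrac{R_{n-\nu} Q_\nu}{P_n}\Bigr| \le C < \infty$ for all $n$.

Proof. It suffices to show that the condition of Theorem (2.11) is satisfied.
$$\sum_{n=0}^{N} \Bigl|\sum_{\nu=k}^{n} \Bigl(\frac{R_{n-\nu} Q_\nu}{P_n} - \frac{R_{n-\nu-1} Q_\nu}{P_{n-1}}\Bigr)\Bigr| = \sum_{n=0}^{N} \sum_{\nu=k}^{n} \Bigl(\frac{R_{n-\nu} Q_\nu}{P_n} - \frac{R_{n-\nu-1} Q_\nu}{P_{n-1}}\Bigr), \quad \text{by (2.12), (1)},$$
$$= \sum_{\nu=k}^{N} \frac{R_{N-\nu} Q_\nu}{P_N} \le \sum_{\nu=0}^{N} \Bigl|\frac{R_{N-\nu} Q_\nu}{P_N}\Bigr| \le C < \infty$$
for all $N$, by (2.12), (2). Therefore
$$\sum_{n=0}^{\infty} \Bigl|\sum_{\nu=k}^{n} \Bigl(\frac{R_{n-\nu} Q_\nu}{P_n} - \frac{R_{n-\nu-1} Q_\nu}{P_{n-1}}\Bigr)\Bigr| \le C < \infty \quad \text{for all } k.$$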


(2.13) THEOREM. If (1) $\{P_n\}$ and $\{Q_n\}$ are both non-negative sequences, (2)
$\sum_{\nu=0}^{n} (|R_{n-\nu}|\, Q_\nu / P_n) \le C < \infty$ independently of $n$, and (3) there exists a positive
integer $N$ such that $(R_{m-1}/P_{n-1}) - (R_m/P_n) \ge 0$ whenever $n \ge m \ge N$, then
$|N_q| \subset |N_p|$.

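Proof. Again we show that the condition of Theorem (2.11) is satisfied. First,
$$\sum_{n=0}^{\infty} \Bigl|\sum_{\nu=k}^{n} \Bigl(\frac{R_{n-\nu} Q_\nu}{P_n} - \frac{R_{n-\nu-1} Q_\nu}{P_{n-1}}\Bigr)\Bigr| = \sum_{n=k}^{k+N} \Bigl|\sum_{\nu=k}^{n} \Bigl(\frac{R_{n-\nu} Q_\nu}{P_n} - \frac{R_{n-\nu-1} Q_\nu}{P_{n-1}}\Bigr)\Bigr| + \sum_{n=k+N+1}^{\infty} \Bigl|\sum_{\nu=k}^{n} \Bigl(\frac{R_{n-\nu} Q_\nu}{P_n} - \frac{R_{n-\nu-1} Q_\nu}{P_{n-1}}\Bigr)\Bigr| \tag{2.14}$$
and
$$\begin{aligned} \sum_{\nu=k}^{n} \Bigl(\frac{R_{n-\nu} Q_\nu}{P_n} - \frac{R_{n-\nu-1} Q_\nu}{P_{n-1}}\Bigr) &= \sum_{\nu=0}^{n} \frac{R_{n-\nu} Q_\nu}{P_n} - \sum_{\nu=0}^{k-1} \frac{R_{n-\nu} Q_\nu}{P_n} - \sum_{\nu=0}^{n} \frac{R_{n-\nu-1} Q_\nu}{P_{n-1}} + \sum_{\nu=0}^{k-1} \frac{R_{n-\nu-1} Q_\nu}{P_{n-1}} \\ &= 1 - \sum_{\nu=0}^{k-1} \frac{R_{n-\nu} Q_\nu}{P_n} - 1 + \sum_{\nu=0}^{k-1} \frac{R_{n-\nu-1} Q_\nu}{P_{n-1}}, \quad \text{by (2.06)}, \\ &= \sum_{\nu=0}^{k-1} Q_\nu \Bigl(\frac{R_{n-\nu-1}}{P_{n-1}} - \frac{R_{n-\nu}}{P_n}\Bigr). \end{aligned} \tag{2.15}$$
Now
$$\sum_{\nu=0}^{k-1} \frac{|R_{n-\nu}|\, Q_\nu}{P_n} \le \sum_{\nu=0}^{n} \frac{|R_{n-\nu}|\, Q_\nu}{P_n} \le C, \qquad \sum_{\nu=0}^{k-1} \frac{|R_{n-\nu-1}|\, Q_\nu}{P_{n-1}} \le \sum_{\nu=0}^{n-1} \frac{|R_{n-\nu-1}|\, Q_\nu}{P_{n-1}} \le C, \tag{2.16}$$
by (2.13), (1), and (2.13), (2). Therefore,
$$\begin{aligned} \sum_{n=k}^{k+N} \Bigl|\sum_{\nu=k}^{n} \Bigl(\frac{R_{n-\nu} Q_\nu}{P_n} - \frac{R_{n-\nu-1} Q_\nu}{P_{n-1}}\Bigr)\Bigr| &= \sum_{n=k}^{k+N} \Bigl|\sum_{\nu=0}^{k-1} Q_\nu \Bigl(\frac{R_{n-\nu-1}}{P_{n-1}} - \frac{R_{n-\nu}}{P_n}\Bigr)\Bigr|, \quad \text{by (2.15)}, \\ &\le (N + 1)\,2C < \infty \quad \text{for all } k, \text{ by (2.16)}. \end{aligned} \tag{2.17}$$
When $n \ge k + N + 1$, it is clear that $n - \nu \ge N + 1 + k - \nu$, and so, by
(2.13), (3), $(R_{n-\nu-1}/P_{n-1}) - (R_{n-\nu}/P_n) \ge 0$ for $\nu = 0, 1, \cdots, k$. Hence,
$$\sum_{\nu=k}^{n} \Bigl(\frac{R_{n-\nu} Q_\nu}{P_n} - \frac{R_{n-\nu-1} Q_\nu}{P_{n-1}}\Bigr) = \sum_{\nu=0}^{k-1} Q_\nu \Bigl(\frac{R_{n-\nu-1}}{P_{n-1}} - \frac{R_{n-\nu}}{P_n}\Bigr) \ge 0,$$
and
$$\begin{aligned} \sum_{n=k+N+1}^{M} \Bigl|\sum_{\nu=k}^{n} \Bigl(\frac{R_{n-\nu} Q_\nu}{P_n} - \frac{R_{n-\nu-1} Q_\nu}{P_{n-1}}\Bigr)\Bigr| &= \sum_{n=k+N+1}^{M} \sum_{\nu=k}^{n} \Bigl(\frac{R_{n-\nu} Q_\nu}{P_n} - \frac{R_{n-\nu-1} Q_\nu}{P_{n-1}}\Bigr) \\ &= \sum_{\nu=k}^{M} \frac{R_{M-\nu} Q_\nu}{P_M} - \sum_{\nu=k}^{N+k} \frac{R_{N+k-\nu} Q_\nu}{P_{N+k}} \\ &\le \sum_{\nu=k}^{M} \frac{|R_{M-\nu}|\, Q_\nu}{P_M} + \sum_{\nu=k}^{N+k} \frac{|R_{N+k-\nu}|\, Q_\nu}{P_{N+k}} \le 2C, \end{aligned} \tag{2.18}$$
independently of $M$, $N$ and $k$. Therefore, applying (2.17) and (2.18) to (2.14),
we get
$$\sum_{n=0}^{\infty} \Bigl|\sum_{\nu=k}^{n} \Bigl(\frac{R_{n-\nu} Q_\nu}{P_n} - \frac{R_{n-\nu-1} Q_\nu}{P_{n-1}}\Bigr)\Bigr| \le 2C(N + 2) < \infty,$$
independently of $k$.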


(2.19) THEOREM. If (1) $p_n$ is non-negative, (2) $Q_n$ is non-negative, (3) $Q_n/P_n \le
C < \infty$ and (4) there exists a positive integer $N$ such that $R_m$ is non-negative and
non-increasing whenever $m \ge N$, then $|N_q| \subset |N_p|$.


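Proof. We will show that the conditions of Theorem (2.13) are satisfied.
First, $P_n = \sum_{k=0}^{n} p_k \ge 0$ and $Q_n \ge 0$. Second,
$$\begin{aligned} \sum_{\nu=0}^{n} \frac{|R_{n-\nu}|\, Q_\nu}{P_n} &= \sum_{\nu=0}^{n-N} \frac{R_{n-\nu} Q_\nu}{P_n} + \sum_{\nu=n-N+1}^{n} \frac{|R_{n-\nu}|\, Q_\nu}{P_n} \\ &= 1 - \sum_{\nu=n-N+1}^{n} \frac{R_{n-\nu} Q_\nu}{P_n} + \sum_{\nu=n-N+1}^{n} \frac{|R_{n-\nu}|\, Q_\nu}{P_n} \\ &\le 1 + 2 \sum_{\nu=n-N+1}^{n} \frac{|R_{n-\nu}|\, Q_\nu}{P_n} \\ &\le 1 + 2 \max_{0 \le k < N} |R_k| \sum_{\nu=n-N}^{n} \frac{Q_\nu}{P_n} \\ &\le 1 + 2C \max_{0 \le k < N} |R_k| \sum_{\nu=n-N}^{n} \frac{P_\nu}{P_n}, \quad \text{by (1.04)}, \\ &\le C' < \infty, \end{aligned}$$
since $N$ is finite. Third,
$$\frac{R_{m-1}}{P_{n-1}} - \frac{R_m}{P_n} \ge \frac{R_{m-1} - R_m}{P_n} \ge 0 \quad \text{for } m > N.$$
Thus our theorem is proved.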
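(2.20) THEOREM. If $-1 < \alpha \le \beta$, then $|C, \alpha| \subset |C, \beta|$.

Proof. Clearly there is no loss of generality in assuming $\gamma = \beta - \alpha \le 1$.
It is well known that the Cesàro transformation $(C, \beta)$ is the Nörlund transformation
$N_p$, where $p(x) = (1 - x)^{-\beta}$. Now
$$R(x) = \frac{p(x)}{q(x)} = \frac{(1 - x)^{-\beta}}{(1 - x)^{-\alpha}} = (1 - x)^{-\gamma}.$$
Hence,
$$R_n = \frac{\gamma(\gamma + 1) \cdots (\gamma + n - 1)}{n!}, \tag{2.21}$$
which is non-negative and non-increasing, since $0 < \gamma \le 1$.

Case 1. $\beta \ge 0$. In this case we show that the conditions of Theorem (2.19)
are fulfilled.

(1) $p_n = \dfrac{\beta(\beta + 1) \cdots (\beta + n - 1)}{n!} \ge 0$, since $\beta \ge 0$.

(2) $Q_n = \dfrac{(\alpha + 1)(\alpha + 2) \cdots (\alpha + n)}{n!} > 0$, since $\alpha > -1$.

(3) $\dfrac{Q_n}{P_n} = \dfrac{(\alpha + 1)(\alpha + 2) \cdots (\alpha + n)}{(\beta + 1)(\beta + 2) \cdots (\beta + n)} \le 1$, since $\alpha \le \beta$.

(4) $R_n$ is non-negative and non-increasing by (2.21).

Case 2. $-1 < \beta < 0$. In this case it will be shown that the conditions of
Theorem (2.13) are fulfilled.

(1) $P_n = \dfrac{(\beta + 1)(\beta + 2) \cdots (\beta + n)}{n!} > 0$, since $\beta > -1$;
$Q_n = \dfrac{(\alpha + 1)(\alpha + 2) \cdots (\alpha + n)}{n!} > 0$, since $\alpha > -1$.

(2) $\sum_{\nu=0}^{n} \dfrac{|R_{n-\nu}|\, Q_\nu}{P_n} = \sum_{\nu=0}^{n} \dfrac{R_{n-\nu} Q_\nu}{P_n}$, since $R_n$, $P_n$ and $Q_n \ge 0$,
$= 1$, by (2.06).

(3) $$\begin{aligned} \frac{R_{n-\nu-1}}{P_{n-1}} - \frac{R_{n-\nu}}{P_n} &= \frac{R_{n-\nu-1}}{P_n} \Bigl(\frac{P_n}{P_{n-1}} - \frac{R_{n-\nu}}{R_{n-\nu-1}}\Bigr) \\ &= \frac{R_{n-\nu-1}}{P_n} \Bigl(\frac{\beta + n}{n} - \frac{\gamma + n - \nu - 1}{n - \nu}\Bigr) \\ &= \frac{R_{n-\nu-1}}{n(n - \nu) P_n}\,(n\alpha - \beta\nu + n) \\ &\ge 0, \quad \text{since } \alpha + 1 \ge \beta. \end{aligned}$$

Thus our theorem is completely proved.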
Let us write $\Delta p_n = p_n - p_{n-1}$, and $\Delta^k p_n = \Delta(\Delta^{k-1} p_n)$.

(2.22) THEOREM. If (1) $p_n \ge 0$, (2) $n^k/P_n \le C < \infty$, (3) $\Delta^k p_n \ge 0$ and (4)
$\Delta^{k+1} p_n \le 0$, then $|C, k| \subset |N_p|$, $k = 0, 1, 2, \cdots$.

Proof. It will be shown that the conditions of Theorem (2.19) are satisfied.

(1) $p_n \ge 0$, by (2.22), (1).

(2) $Q_n = \dfrac{(k + 1)(k + 2) \cdots (k + n)}{n!} > 0$, for $k = 0, 1, 2, \cdots$.

(3) $\dfrac{Q_n}{P_n} \le C\,\dfrac{n^k}{P_n} < \infty$, by (2.22), (2).

(4) $R(x) = \dfrac{p(x)}{q(x)} = (1 - x)^k p(x) = 1 + \Delta^k p_1 x + \cdots + \Delta^k p_n x^n + \cdots$.
Therefore, $R_n = \Delta^k p_n \ge 0$ and $R_n - R_{n-1} = \Delta^{k+1} p_n \le 0$.

(2.23) THEOREM. If $N_q$ is the transformation defined by the sequence $\{q^n\}$ and
$N_p$ is the transformation defined by the sequence $\{p^n\}$, where $0 \le q \le p \le 1$,
then $|N_q| \subset |N_p|$.

Proof. The conditions of Theorem (2.19) are satisfied.

(1) $p_n = p^n \ge 0$.

(2) $Q_n = \sum_{k=0}^{n} q^k \ge 0$.

(3) $\dfrac{Q_n}{P_n} \le 1$, since $q \le p$.

(4) $R(x) = \dfrac{p(x)}{q(x)} = \dfrac{(1 - px)^{-1}}{(1 - qx)^{-1}} = 1 + (p - q)x + \cdots + p^{n-1}(p - q)x^n + \cdots$.

Therefore, $R_n = p^{n-1}(p - q)$, which is clearly non-negative and non-increasing,
since $0 \le q \le p \le 1$.
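For the geometric sequences of Theorem (2.23), the sum appearing in the condition of Theorem (2.11) can be evaluated directly. The sketch below is illustrative only: the values $p = 1/2$, $q = 1/3$ and the truncation `M` are assumed sample choices, and the helper `A(n, k)` (a name introduced here) denotes $\sum_{\nu=k}^{n} R_{n-\nu} Q_\nu / P_n$, taken as $0$ for $n < k$, so that the inner sum of (2.11) equals `A(n, k) - A(n - 1, k)` because $R_{-1} = 0$.

```python
from fractions import Fraction

p, q = Fraction(1, 2), Fraction(1, 3)   # assumed sample values, 0 <= q <= p <= 1
M = 40                                  # illustrative truncation of the outer sum

P = [sum(p ** j for j in range(n + 1)) for n in range(M)]
Q = [sum(q ** j for j in range(n + 1)) for n in range(M)]
# R_0 = 1 and R_n = p^(n-1) (p - q) for n >= 1, as in the proof of (2.23).
R = [Fraction(1)] + [p ** (n - 1) * (p - q) for n in range(1, M)]

def A(n, k):
    """A_n(k) = sum_{v=k}^{n} R_{n-v} Q_v / P_n, taken as 0 when n < k."""
    if n < k:
        return Fraction(0)
    return sum(R[n - v] * Q[v] for v in range(k, n + 1)) / P[n]

def variation(k):
    """Truncated version of the outer sum in the condition of Theorem (2.11)."""
    return sum(abs(A(n, k) - A(n - 1, k)) for n in range(M))

results = {k: variation(k) for k in range(6)}
print({k: float(v) for k, v in results.items()})
```

In this example every $A_n(k)$ is non-decreasing in $n$ and bounded by $A_n(0) = 1$, so each truncated sum stays at or below $1$ however large `M` is taken.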


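(2.24) THEOREM. If $N_q$ is the transformation defined by the sequence $\{q^n\}$, $0 \le
q < 1$, and $N_p$ is the transformation defined by $\{1/(n + 1)\}$, then $|N_q| \subset |N_p|$.

Proof. The conditions of Theorem (2.19) are satisfied.

(1) $p_n = \dfrac{1}{n + 1} > 0$.

(2) $Q_n = \sum_{k=0}^{n} q^k > 0$.

(3) $\dfrac{Q_n}{P_n} = \dfrac{\sum_{k=0}^{n} q^k}{\sum_{k=0}^{n} 1/(k + 1)} \le C < \infty$.

(4) $R(x) = \dfrac{p(x)}{q(x)} = (1 - qx)\Bigl(1 + \dfrac{x}{2} + \dfrac{x^2}{3} + \cdots\Bigr)$.

Therefore,
$$R_n = \frac{1}{n + 1} - \frac{q}{n} \ge 0 \quad \text{for } n \ge \frac{q}{1 - q},$$
and
$$R_n - R_{n-1} = \frac{q}{n(n - 1)} - \frac{1}{n(n + 1)} \le 0 \quad \text{for } n \ge \frac{1 + q}{1 - q}.$$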


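(2.25) THEOREM. If $N_q$ is the transformation defined by $q_n = 1/(n + 1)$, then
$|N_q| \subset |C, \alpha|$ for $\alpha > 0$.

Proof. Due to Theorem (2.20), we can clearly assume $0 < \alpha \le 1$. Again
we show that the conditions of Theorem (2.19) are satisfied.

(1) $p_n = \dfrac{\alpha(\alpha + 1) \cdots (\alpha + n - 1)}{n!} \ge 0$, since $\alpha > 0$.

(2) $Q_n = \sum_{k=0}^{n} \dfrac{1}{k + 1} > 0$.

(3) $\dfrac{Q_n}{P_n} \le C\,\dfrac{\log n}{n^\alpha} \le C < \infty$.

(4) $R(x) = \dfrac{p(x)}{q(x)} = -\dfrac{x(1 - x)^{-\alpha}}{\log (1 - x)}$.

Now we observe that
$$\int_0^1 (1 - x)^z\, dz = \Bigl[\frac{(1 - x)^z}{\log (1 - x)}\Bigr]_0^1 = \frac{-x}{\log (1 - x)}.$$
Therefore,
$$R(x) = (1 - x)^{-\alpha} \int_0^1 (1 - x)^z\, dz = \int_0^1 (1 - x)^{z-\alpha}\, dz.$$
Hence,
$$\begin{aligned} R_n &= \frac{(-1)^n}{n!} \int_0^1 (z - \alpha)(z - \alpha - 1) \cdots (z - \alpha - n + 1)\, dz \\ &= \frac{1}{n!} \int_0^1 (\alpha - z)(\alpha + 1 - z) \cdots (\alpha + n - 1 - z)\, dz \\ &= \frac{1}{n!} \Bigl\{\int_0^\alpha (\alpha - z)(\alpha + 1 - z) \cdots (\alpha + n - 1 - z)\, dz - \int_\alpha^1 (z - \alpha)(\alpha + 1 - z) \cdots (\alpha + n - 1 - z)\, dz\Bigr\} \\ &\ge \frac{1}{n!} \Bigl\{\int_0^{\alpha/2} (\alpha - z)(\alpha + 1 - z) \cdots (\alpha + n - 1 - z)\, dz - \int_\alpha^1 (z - \alpha)(\alpha + 1 - z) \cdots (\alpha + n - 1 - z)\, dz\Bigr\} \\ &\ge \frac{1}{n!} \Bigl\{\frac{\alpha}{2} \cdot \frac{\alpha}{2}\Bigl(1 + \frac{\alpha}{2}\Bigr) \cdots \Bigl(n - 1 + \frac{\alpha}{2}\Bigr) - (1 - \alpha)(1 - \alpha) \cdot 1 \cdot 2 \cdots (n - 1)\Bigr\} \\ &> 0, \end{aligned} \tag{2.26}$$
for sufficiently large $n$. Next we observe that
$$(1 - x)\frac{p(x)}{q(x)} = (1 - x) \int_0^1 (1 - x)^{z-\alpha}\, dz = \int_0^1 (1 - x)^{z-\alpha+1}\, dz,$$
and, in a manner similar to that above, we find
$$R_n - R_{n-1} \le 0, \tag{2.27}$$
for sufficiently large $n$. Thus, by (2.26) and (2.27), $R_n$ is non-negative and non-increasing
for sufficiently large $n$, and our theorem is proved.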


(2.28) THEOREM. If (1) $q_n$ is non-negative and non-increasing and (2) $q_{n+1}/q_n$
is non-decreasing, then $|N_q| \subset |C, 1|$.

Proof. Again we show that the conditions of Theorem (2.19) are satisfied.

(1) $p_n = 1 > 0$.

(2) $Q_n = \sum_{k=0}^{n} q_k > 0$.

(3) $q_n$ is non-increasing and $q_0 = 1$; therefore, $Q_n \le n + 1$. However,
$P_n = n + 1$, and so clearly $Q_n/P_n \le 1$.

(4) $R(x) = \dfrac{p(x)}{q(x)} = \dfrac{(1 - x)^{-1}}{q(x)} = \dfrac{1}{(1 - x)q(x)} = \dfrac{1}{1 - c_1 x - c_2 x^2 - \cdots}$,
where $c_n = q_{n-1} - q_n \ge 0$. Hence,
$$R_0 = 1 > 0, \qquad R_n = c_1 R_{n-1} + \cdots + c_n R_0 \ge 0.$$
Also,
$$(1 - x)R(x) = \frac{(1 - x)p(x)}{q(x)} = \frac{1}{q(x)} = 1 + d_1 x + d_2 x^2 + \cdots.$$
Kaluza [9] proved that if $f(x) = a_0 + a_1 x + \cdots + a_n x^n + \cdots$ and $1/f(x) =
b_0 + b_1 x + \cdots + b_n x^n + \cdots$, where $a_n$ is non-negative and $a_{n+1}/a_n$ is non-decreasing,
then $b_n$ is non-positive. Thus it follows that $R_n - R_{n-1} = d_n \le 0$;
hence $R_n$ is non-negative and non-increasing.


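(2.29) THEOREM. If (1) $p_n$ is non-negative and non-decreasing and (2) $p_{n+1}/p_n$
is non-increasing, then $|C, 1| \subset |N_p|$.

Proof. We will show that the conditions of Theorem (2.12) are satisfied.
First, observe that
$$R(x) = \frac{p(x)}{(1 - x)^{-1}} = (1 - x)p(x); \tag{2.30}$$
therefore,
$$R_n = p_n - p_{n-1}. \tag{2.31}$$
Now (1)
$$\begin{aligned} \sum_{\nu=k}^{n} \Bigl(\frac{R_{n-\nu} Q_\nu}{P_n} - \frac{R_{n-\nu-1} Q_\nu}{P_{n-1}}\Bigr) &= \sum_{\nu=k}^{n} \Bigl(\frac{(p_{n-\nu} - p_{n-\nu-1})(\nu + 1)}{P_n} - \frac{(p_{n-\nu-1} - p_{n-\nu-2})(\nu + 1)}{P_{n-1}}\Bigr), \quad \text{by (2.31)}, \\ &= \frac{k p_{n-k} + P_{n-k}}{P_n} - \frac{k p_{n-k-1} + P_{n-k-1}}{P_{n-1}}. \end{aligned} \tag{2.32}$$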

From (2.29), (2), it is easily deducible that 
Pant = | a 


(2.33) Pp. P 





Thus, if 


Pot _ Prt-t 5 9 
Pa Pet = ’ 


then, due to (2.33), the expression on the left hand side of (2.32) is non-negative. 


Suppose on the other hand that 


Pn—k Pn—k-1 
_ < 0. 
P. Pe-t 
Then 
Pn—-k Pn—k-1 Dn—k-1 Pn—m Pe - 
P. ~ | = Fe (P= 7 Ff), aa ° = 
> Pn—m Pn—m-1 
_ Pe Pe-t 


Therefore, 


and, substituting in (2.32), we get 


n Ri» Q» ie 
> ( P. Pax! m=k—1 


= 0. 
(2) R, 


IV | 
o 3 
| 
3 
3 


Therefore, 





| 2) > > Pn—m + Pr i 
= P,, 


Pot 
= Dn—k-1 Pr _ Pra 
P.-1 ) + P. Pant j 


=) 
Pe 1 ‘ 
- Pn—m-1 _ Pondxt 
ee, Pe 1 Poxt 
by (2.31), 
by (2.29), (1). 
; by (2.06). 





< k, due to (2.29), (2), 





Thus our theorem is established. 


3. Abel's transformation. The formula
$$\sum_{k=m}^{n} u_k v_k = \sum_{k=m}^{n-1} U_k(v_k - v_{k+1}) - U_{m-1} v_m + U_n v_n, \tag{3.01}$$
where $0 \le m \le n$, $U_k = u_0 + u_1 + \cdots + u_k$ if $k \ge 0$, $U_{-1} = 0$, which can
be verified, is known as Abel's transformation, and will be used extensively in
what follows.

(3.02) COROLLARY. If $v_m, v_{m+1}, \cdots, v_n$ are non-negative and non-increasing,
the left hand side of (3.01) does not exceed $2 v_m \max_{m-1 \le k \le n} |U_k|$ in absolute value. In
fact,
$$\Bigl|\sum_{k=m}^{n} u_k v_k\Bigr| \le \max_{m-1 \le k \le n} |U_k| \Bigl\{\sum_{k=m}^{n-1} (v_k - v_{k+1}) + v_m + v_n\Bigr\} = 2 v_m \max_{m-1 \le k \le n} |U_k|.$$
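Abel's transformation and the bound of the corollary are easy to test on concrete data. The following sketch uses exact rational arithmetic; the sample sequences `u` and `v` and the helper names are illustrative assumptions, with `v` chosen non-negative and non-increasing as the corollary requires.

```python
from fractions import Fraction

def partial_sum(u, k):
    """U_k = u_0 + ... + u_k for k >= 0, and U_{-1} = 0."""
    return sum(u[: k + 1]) if k >= 0 else 0

def abel_sides(u, v, m, n):
    """Return both sides of Abel's transformation for the range m..n."""
    lhs = sum(u[k] * v[k] for k in range(m, n + 1))
    rhs = (
        sum(partial_sum(u, k) * (v[k] - v[k + 1]) for k in range(m, n))
        - partial_sum(u, m - 1) * v[m]
        + partial_sum(u, n) * v[n]
    )
    return lhs, rhs

def corollary_bound(u, v, m, n):
    """2 * v_m * max |U_k| over m-1 <= k <= n."""
    return 2 * v[m] * max(abs(partial_sum(u, k)) for k in range(m - 1, n + 1))

size = 12
u = [(-1) ** k * (k + 1) for k in range(size)]       # assumed sample terms
v = [Fraction(1, k + 1) for k in range(size)]        # non-negative, non-increasing
pairs = [(m, n) for m in range(size) for n in range(m, size)]

identity_ok = all(lhs == rhs for lhs, rhs in (abel_sides(u, v, m, n) for m, n in pairs))
bound_ok = all(abs(abel_sides(u, v, m, n)[0]) <= corollary_bound(u, v, m, n) for m, n in pairs)
print(identity_ok, bound_ok)
```

Every pair $0 \le m \le n$ below `size` is checked, including the case $m = 0$, where the convention $U_{-1} = 0$ matters.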


4. Absolute Abel summability. The series $\sum_{n=0}^{\infty} u_n$ is said to be summable by
Abel's method, or summable $A$, to sum $U$ if the expression
$$f(x) = u_0 + u_1 x + \cdots + u_n x^n + \cdots \tag{4.01}$$
is convergent for $|x| < 1$, and
$$\lim_{x \to 1} f(x) = \lim_{x \to 1} \sum_{k=0}^{\infty} u_k x^k = \lim_{x \to 1}\, (1 - x) \sum_{k=0}^{\infty} U_k x^k = U,$$
where $x$ tends to 1 along the real axis.

Following the definition of Whittaker [10], we will say that the series $\sum_{n=0}^{\infty} u_n$
is absolutely Abel summable, or summable $|A|$, if $f(x)$ is of bounded variation
on $(0, 1)$, that is, if $\int_0^1 |f'(x)|\, dx$ exists.

Fekete [3] showed that $\sum_{n=0}^{\infty} u_n$ is summable $|A|$ if it is summable $|C, r|$ for $r$
a positive integer.
Bosanquet [2] extended Fekete's result to include non-integral values of $r$.
We now prove the following


(4.02) THEOREM. Let $N_p$ be a regular Nörlund transformation and let us write
$$\varphi_n(x) = \sum_{k=0}^{n} P_k x^k \Big/ \sum_{k=0}^{\infty} P_k x^k.$$
If (1) $\sum_{n=0}^{\infty} u_n$ is summable $|N_p|$ and (2) the sequence $\{\varphi_n(x)\}$ is uniformly of
bounded variation, then $\sum_{n=0}^{\infty} u_n$ is summable $|A|$.










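Proof. We will write $P(x) = \sum_{n=0}^{\infty} P_n x^n$ and $R(x) = \sum_{n=0}^{\infty} P_n t_n x^n$. Clearly,
$R(x) = \sum_{n=0}^{\infty} \sum_{\nu=0}^{n} p_{n-\nu} U_\nu x^n = f(x)P(x)$; hence, $f(x) = R(x)/P(x)$. Since $N_p$ is
regular, we have $\lim (P_{n+1}/P_n) = 1$, so $P(x)$ has a radius of convergence 1. By
hypothesis, $\sum_{n=0}^{\infty} |t_n - t_{n-1}| < C < \infty$; hence, $t_n$ approaches a finite limit, and
thus $R(x)$ has a radius of convergence 1. Therefore, we can write
$$\begin{aligned} f'(x) &= \frac{P(x)R'(x) - P'(x)R(x)}{P^2(x)} \\ &= \frac{P(x) \sum_{n=0}^{\infty} n P_n t_n x^{n-1} - P'(x) \sum_{n=0}^{\infty} P_n t_n x^n}{P^2(x)} \\ &= \frac{P(x) \sum_{n=0}^{\infty} \sum_{k=0}^{n} k P_k x^{k-1}(t_n - t_{n+1}) - P'(x) \sum_{n=0}^{\infty} \sum_{k=0}^{n} P_k x^k (t_n - t_{n+1})}{P^2(x)}, \end{aligned}$$
by (3.01) (Abel's transformation),
$$\begin{aligned} &= \sum_{n=0}^{\infty} \frac{P(x)\, d(P_0 + \cdots + P_n x^n)/dx - P'(x)(P_0 + \cdots + P_n x^n)}{P^2(x)}\,(t_n - t_{n+1}) \\ &= \sum_{n=0}^{\infty} \varphi_n'(x)(t_n - t_{n+1}). \end{aligned}$$
Therefore,
$$\begin{aligned} \int_0^1 |f'(x)|\, dx &\le \sum_{n=0}^{\infty} \int_0^1 |\varphi_n'(x)|\, dx\; |t_n - t_{n+1}| \\ &\le C \sum_{n=0}^{\infty} |t_n - t_{n+1}|, \quad \text{by (4.02), (2)}, \\ &< \infty, \quad \text{by (4.02), (1)}, \end{aligned}$$
which is the condition that $\sum_{n=0}^{\infty} u_n$ be summable $|A|$.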


(4.03) THEOREM. If (1) $N_p$ is regular, (2) $P_n \ge 0$ and (3) $\sum_{n=0}^{\infty} u_n$ is summable
$|N_p|$, then $\sum_{n=0}^{\infty} u_n$ is summable $|A|$.

Proof. It is clear from (4.02) that we only need to show that $\{\varphi_n(x)\}$ is uniformly
of bounded variation. Now
$$\varphi_n'(x) = \frac{d}{dx}\Bigl(\sum_{k=0}^{n} P_k x^k \Big/ \sum_{k=0}^{\infty} P_k x^k\Bigr) = \sum_{k=0}^{\infty} V_k x^k \Big/ \Bigl(\sum_{k=0}^{\infty} P_k x^k\Bigr)^2,$$
where
$$V_k = P_1 P_k + 2P_2 P_{k-1} + \cdots + nP_n P_{k-n+1} - \bigl[(k + 1)P_0 P_{k+1} + kP_1 P_k + \cdots + (k - n + 1)P_n P_{k-n+1}\bigr],$$
it being understood that $P_r = 0$ when $r$ is negative. If we consider separately
the cases $0 \le k \le n$, $n < k \le 2n$ and $k > 2n$, it is easily seen that in all cases
$V_k \le 0$. Therefore $\varphi_n'(x) \le 0$ for all $n$ and $0 \le x < 1$. Hence
$$\int_0^1 |\varphi_n'(x)|\, dx = -\int_0^1 \varphi_n'(x)\, dx = -\varphi_n(x)\Big|_0^1 = 1 \quad \text{for all } n.$$
Thus our theorem is established.
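The sign of the coefficients $V_k$ can be examined numerically. The sketch below computes $V_k$ as the coefficient of $x^k$ in $A'(x)P(x) - A(x)P'(x)$ with $A(x) = P_0 + \cdots + P_n x^n$; the sample sequence $p_j = 1/(j + 1)$, the array length and the function name `V` are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import accumulate

def V(P, n, k):
    """Coefficient of x^k in A'(x) P(x) - A(x) P'(x), A(x) = P_0 + ... + P_n x^n."""
    def coeff(r):
        return P[r] if r >= 0 else 0           # P_r = 0 for negative r
    gained = sum(j * P[j] * coeff(k - j + 1) for j in range(1, n + 1))
    lost = sum((k - j + 1) * P[j] * coeff(k - j + 1) for j in range(n + 1))
    return gained - lost

length = 40
p = [Fraction(1, j + 1) for j in range(length)]   # assumed sample sequence
P = list(accumulate(p))                           # partial sums P_0, P_1, ...

values = {(n, k): V(P, n, k) for n in range(6) for k in range(length - 1)}
print(max(values.values()))
```

In this sample the coefficients vanish for $k < n$ and are negative from $k = n$ onward, in line with the three cases distinguished in the proof.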


5. Fourier series. In this section we will be concerned with a function $f(x)$ of period $2\pi$ and belonging to some class $L^q$, $q \ge 1$. If the Fourier series $S(f)$ and the conjugate series $\bar S(f)$ of the function $f(x)$ are respectively

(5.01)
$$\tfrac12 a_0 + \sum_{k=1}^{\infty} (a_k \cos kx + b_k \sin kx), \qquad \sum_{k=1}^{\infty} (a_k \sin kx - b_k \cos kx),$$

we can write for the $n$-th partial sums $S_n(f, x)$ and $\bar S_n(f, x)$ of these series

(5.02)
$$S_n(x) - f(x) = \frac{1}{\pi}\int_0^{\pi} \phi(t)D_n(t)\,dt, \qquad \bar S_n(x) = -\frac{1}{\pi}\int_0^{\pi} \psi(t)\bar D_n(t)\,dt,$$

where $\phi(t) = \phi_x(t) = f(x+t) + f(x-t) - 2f(x)$, $\psi(t) = \psi_x(t) = f(x+t) - f(x-t)$,

$$D_n(t) = \frac{\sin (n + \tfrac12)t}{2\sin \tfrac12 t}, \qquad \bar D_n(t) = \frac{\cos \tfrac12 t - \cos (n + \tfrac12)t}{2\sin \tfrac12 t}.$$
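The closed forms of the two kernels in (5.02) can be checked numerically against the trigonometric sums they abbreviate, $D_n(t) = \tfrac12 + \sum_{k=1}^{n} \cos kt$ and $\bar D_n(t) = \sum_{k=1}^{n} \sin kt$. The following is only a sketch of such a check; the function names are illustrative and do not come from the paper.

```python
import math

def dirichlet_sum(n, t):
    """D_n(t) written as 1/2 + cos t + ... + cos nt."""
    return 0.5 + sum(math.cos(k * t) for k in range(1, n + 1))

def dirichlet_closed(n, t):
    """Closed form sin((n + 1/2) t) / (2 sin(t/2))."""
    return math.sin((n + 0.5) * t) / (2.0 * math.sin(0.5 * t))

def conjugate_sum(n, t):
    """Conjugate kernel written as sin t + ... + sin nt."""
    return sum(math.sin(k * t) for k in range(1, n + 1))

def conjugate_closed(n, t):
    """Closed form (cos(t/2) - cos((n + 1/2) t)) / (2 sin(t/2))."""
    return (math.cos(0.5 * t) - math.cos((n + 0.5) * t)) / (2.0 * math.sin(0.5 * t))

def max_kernel_error(n_max=12, samples=50):
    """Largest discrepancy between sums and closed forms on a grid in (0, pi)."""
    worst = 0.0
    for n in range(n_max + 1):
        for j in range(1, samples):
            t = math.pi * j / samples
            worst = max(worst,
                        abs(dirichlet_sum(n, t) - dirichlet_closed(n, t)),
                        abs(conjugate_sum(n, t) - conjugate_closed(n, t)))
    return worst
```

The grid avoids $t = 0$, where both closed forms are defined only as limits.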


Bernstein [1] showed that, if $f(x)$ belongs to Lip $\alpha$, the Fourier series of $f(x)$ converges absolutely for all values of $x$ when $\alpha > \tfrac12$, but not necessarily when $\alpha \le \tfrac12$. Later Hyslop [5] extended Bernstein's theorem by showing that, if $f(x)$ belongs to Lip $\alpha$, where $0 < \alpha \le \tfrac12$, the Fourier series of $f(x)$ is summable $|C, \beta|$ for all values of $x$ if $\alpha + \beta > \tfrac12$.

Our object will be to extend the result of Hyslop to the case of summability $|N_p|$ for certain types of Nörlund means. Analogous theorems for the conjugate series will be established at the same time, and certain other results obtained.


(5.03) Notation and lemmas. We will write $p(y) = p_{[y]}$, and $P(y) = P_{[y]}$, where $[y]$ as usual denotes the greatest integer less than $y$.










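(5.04) $P_n = \sum_{\nu=0}^{n} p_\nu$.

(5.05) $\Phi(h) = \int_0^h |\phi(t)|\,P(t^{-1})\,dt, \qquad \Psi(h) = \int_0^h |\psi(t)|\,P(t^{-1})\,dt.$

(5.06) $\alpha(t) = \sum_{k=0}^{\infty} p_k \cos kt.$

(5.07) $\beta(t) = \sum_{k=0}^{\infty} p_k \sin kt.$

(5.08) $\alpha_n = \int_0^{\pi} \phi(t)\alpha(t) \cos nt\,dt, \qquad \bar\alpha_n = \int_0^{\pi} \psi(t)\alpha(t) \sin nt\,dt.$

(5.09) $\beta_n = \int_0^{\pi} \phi(t)\beta(t) \sin nt\,dt, \qquad \bar\beta_n = \int_0^{\pi} \psi(t)\beta(t) \cos nt\,dt.$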


Furthermore, $A$ will denote an absolute positive constant, and we will write $A + A = A$ and $A \cdot A = A$. By $f_n \sim g_n$, it will be meant that there exist two constants $A_1 > 0$ and $A_2 > 0$, such that $A_1 \le f_n/g_n \le A_2$ for $n$ sufficiently large.

(5.10) LEMMA. If $p_n$ is non-negative and non-increasing, then $t^{-1}p(t^{-1}) \le P(t^{-1})$.

Proof. $P_n = p_0 + p_1 + \cdots + p_n \ge (n+1)p_n$. Therefore, $t^{-1}p(t^{-1}) \le ([t^{-1}] + 1)\,p_{[t^{-1}]} \le P_{[t^{-1}]} \le P(t^{-1})$.
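As a small numerical illustration of Lemma (5.10), the inequality $t^{-1}p(t^{-1}) \le P(t^{-1})$ can be evaluated for a sample weight sequence. This is a sketch under stated assumptions: the weights $p_k = 1/(k+1)$ are a hypothetical example, and $[y]$ is implemented as `math.floor(y)`.

```python
import math

def partial_sums(p):
    """Return [P_0, P_1, ...] with P_n = p_0 + ... + p_n."""
    sums, running = [], 0.0
    for value in p:
        running += value
        sums.append(running)
    return sums

def lemma_5_10_gap(p, t):
    """Return P([1/t]) - (1/t) * p([1/t]); Lemma (5.10) asserts this is >= 0."""
    m = math.floor(1.0 / t)  # assumption: [y] is taken to be floor(y)
    big_p = partial_sums(p)
    return big_p[m] - p[m] / t

# Hypothetical non-negative, non-increasing weights (not taken from the paper).
p = [1.0 / (k + 1) for k in range(200)]
gaps = [lemma_5_10_gap(p, t) for t in (0.013, 0.1, 0.37, 1.0, 3.0)]
```

Every entry of `gaps` should be non-negative, mirroring the step $([t^{-1}] + 1)p_{[t^{-1}]} \le P_{[t^{-1}]}$ of the proof.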
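(5.11) LEMMA.² If $p_n$ is non-negative and non-increasing, then, for $0 \le a < b \le \infty$, $0 \le t \le \pi$ and any $n$, we have

$$\left|\sum_{k=a}^{b} p_k e^{i(n-k)t}\right| \le \begin{cases} A\,P(t^{-1}), & \text{for any } a, \\ A\,t^{-1}p_a, & \text{for } a \ge [t^{-1}]. \end{cases}$$

Proof. Let $\tau = [t^{-1}]$. Then

$$\left|\sum_{k=a}^{b} p_k e^{i(n-k)t}\right| = \left|e^{int}\sum_{k=a}^{b} p_k e^{-ikt}\right|;$$

but

$$\left|\sum_{k=a}^{\tau-1} p_k e^{-ikt}\right| \le \sum_{k=a}^{\tau-1} p_k \le P_\tau \le P(t^{-1}),$$

and, by (3.02),

$$\left|\sum_{k=\tau}^{b} p_k e^{-ikt}\right| \le 2p_\tau \max_{\tau \le k \le b}\left|\frac{1 - e^{-i(k+1)t}}{1 - e^{-it}}\right| \le 4p_\tau\left|\frac{1}{e^{it/2} - e^{-it/2}}\right| \le 2p_\tau\,(1/\sin \tfrac12 t) \le A\,t^{-1}p(t^{-1}).$$

The lemma follows immediately, since, by Lemma (5.10), $t^{-1}p(t^{-1}) \le P(t^{-1})$, and, in case $a \ge [t^{-1}]$, we would have

$$\left|\sum_{k=a}^{b} p_k e^{-ikt}\right| \le 2p_a \max_{a \le k \le b}\left|\frac{1 - e^{-i(k+1)t}}{1 - e^{-it}}\right| \le A\,t^{-1}p_a.$$

² This lemma is due to Tamarkin and Hille.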


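(5.12) LEMMA. If $p_n$ is non-negative and non-increasing, then

$$(P_0 + \cdots + P_n)\,\frac{p_n}{P_n} \le P(t^{-1}) \quad\text{for } t \le \frac{1}{n}.$$

Proof.

$$(P_0 + \cdots + P_n)\,\frac{p_n}{P_n} \le (n+1)p_n \le P_n \le P(t^{-1}) \quad\text{for } t \le \frac{1}{n}.$$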


(5.13) LEMMA. If (1) $p_n$ is non-negative and non-increasing and (2) $p_n - p_{n+1}$ is non-increasing, then

$$\frac{(n+1)^2(p_n - p_{n+1})}{P_n} \le A.$$

Proof.

$$(n+1)^2(p_n - p_{n+1}) \le A\left(\sum_{k=0}^{n}(k+1)\right)(p_n - p_{n+1}) = A\sum_{k=0}^{n}(k+1)(p_n - p_{n+1})$$

$$\le A\sum_{k=0}^{n}(k+1)(p_k - p_{k+1}), \text{ by (5.13), (2),}$$

$$= A\{P_n - (n+1)p_{n+1}\} \le A\,P_n.$$
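The bound of Lemma (5.13) can be illustrated numerically. In the sketch below the weights $p_k = (k+1)^{-1/2}$ are a hypothetical example chosen so that both hypotheses hold; since $(n+1)^2 \le 2\sum_{k=0}^{n}(k+1)$, the argument of the proof gives the explicit constant $A = 2$, which is what the check uses.

```python
def lemma_5_13_ratios(p):
    """Return (n+1)^2 (p_n - p_{n+1}) / P_n for n = 0, ..., len(p) - 2."""
    ratios, running = [], 0.0
    for n in range(len(p) - 1):
        running += p[n]  # running equals P_n after this line
        ratios.append((n + 1) ** 2 * (p[n] - p[n + 1]) / running)
    return ratios

# Hypothetical weights: non-negative, non-increasing, with non-increasing differences.
p = [(k + 1) ** -0.5 for k in range(2000)]
diffs = [p[k] - p[k + 1] for k in range(len(p) - 1)]
hypotheses_hold = (all(d >= 0 for d in diffs)
                   and all(diffs[k] >= diffs[k + 1] for k in range(len(diffs) - 1)))
largest_ratio = max(lemma_5_13_ratios(p))
```

For these weights `largest_ratio` stays well below the constant 2 supplied by the proof.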


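(5.14) LEMMA. If $p_n$ is non-negative and non-increasing, then

$$\sum_{k=n}^{\infty} \frac{k(p_k - p_{k+1})}{P_kP_{k-1}} \le \frac{A}{P_{n-1}}.$$

Proof.

$$\sum_{k=n}^{\infty} \frac{k(p_k - p_{k+1})}{P_kP_{k-1}} = \frac{n(p_n - p_{n+1})}{P_nP_{n-1}} + \frac{(n+1)(p_{n+1} - p_{n+2})}{P_{n+1}P_n} + \cdots$$

$$= \frac{np_n}{P_nP_{n-1}} + \sum_{k=n}^{\infty} p_{k+1}\left(\frac{k+1}{P_{k+1}P_k} - \frac{k}{P_kP_{k-1}}\right)$$

$$= \frac{np_n}{P_nP_{n-1}} + \sum_{k=n}^{\infty} \frac{kp_{k+1}}{P_k}\left(\frac{1}{P_{k+1}} - \frac{1}{P_{k-1}}\right) + \sum_{k=n}^{\infty} \frac{p_{k+1}}{P_kP_{k+1}}$$

$$\le \frac{np_n}{P_nP_{n-1}} + \sum_{k=n}^{\infty} \frac{p_{k+1}}{P_kP_{k+1}} \le \frac{A}{P_{n-1}}.$$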























(5.15) LEMMA. If $p_n$ is non-negative and non-increasing, then $n^{-1}P_n \le tP(t^{-1})$ for $1/n \le t \le \pi$.

Proof.

$$\frac{P_n}{n} - \frac{P_{n+1}}{n+1} = \frac{P_n - np_{n+1}}{n(n+1)} \ge 0.$$

Hence $P_n/n$ decreases as $n$ increases and the result follows.
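The monotonicity used in the proof of Lemma (5.15), namely that $P_n/n$ does not increase with $n$, is easy to observe numerically. The sketch below assumes the hypothetical weights $p_k = 1/(k+1)$; the helper name is illustrative only.

```python
def averaged_partial_sums(p):
    """Return [P_1/1, P_2/2, ...] for the weight sequence p."""
    out, running = [], 0.0
    for n, value in enumerate(p):
        running += value  # running equals P_n
        if n >= 1:
            out.append(running / n)
    return out

# Hypothetical non-negative, non-increasing weights (not taken from the paper).
p = [1.0 / (k + 1) for k in range(500)]
q = averaged_partial_sums(p)
is_non_increasing = all(q[i] >= q[i + 1] - 1e-15 for i in range(len(q) - 1))
```

The same check fails for increasing weights such as $p_k = k$, which shows that the monotonicity hypothesis on $p_n$ is actually used.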





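(5.16) LEMMA. Let $N_p$ be a given Nörlund transformation with $p_n$ non-negative and non-increasing, $\lim_{n\to\infty} p_n = 0$ and $|\Delta p_n|$ non-increasing. Suppose also that $\phi(t)P(t^{-1})$ belongs to class $L^q$, $q \ge 1$. Then

(5.17)
$$\sum_{n=1}^{\infty} |t_n - t_{n-1}| \le A\left\{1 + \sum_{n=1}^{\infty} \frac{\Phi(n^{-1})}{P_{n-1}} + \sum_{n=1}^{\infty} \frac{|\alpha_n| + |\beta_n|}{P_{n-1}}\right\}.$$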


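Proof.

$$\pi(t_n - t_{n-1}) = \int_0^{\pi} \phi(t) \sum_{k=0}^{n}\left(\frac{p_{n-k}}{P_n} - \frac{p_{n-k-1}}{P_{n-1}}\right)D_k(t)\,dt$$

$$= \int_0^{\pi} \phi(t) \sum_{k=1}^{n}\left(\frac{P_{n-k}}{P_n} - \frac{P_{n-k-1}}{P_{n-1}}\right)\cos kt\,dt$$

$$= \int_0^{\pi} \phi(t) \sum_{k=0}^{n-1}\left(\frac{P_k}{P_n} - \frac{P_{k-1}}{P_{n-1}}\right)\cos (n-k)t\,dt$$

$$= \frac{1}{P_nP_{n-1}}\int_0^{\pi} \phi(t) \sum_{k=0}^{n-1}(P_kP_{n-1} - P_{k-1}P_n)\cos (n-k)t\,dt.$$

Therefore,

$$\pi(t_n - t_{n-1}) = \frac{1}{P_nP_{n-1}}\int_0^{\pi} \phi(t) \sum_{k=0}^{n-1}(p_kP_n - p_nP_k)\cos (n-k)t\,dt$$

$$= \frac{1}{P_nP_{n-1}}\int_0^{\pi} \phi(t)\left\{P_n\sum_{k=0}^{\infty} p_k\cos (n-k)t - P_n\sum_{k=n}^{\infty} p_k\cos (n-k)t - p_n\sum_{k=0}^{n-1} P_k\cos (n-k)t\right\}dt.$$

Hence,

$$\pi|t_n - t_{n-1}| \le \frac{1}{P_{n-1}}\left|\int_0^{\pi} \phi(t)\sum_{k=0}^{\infty} p_k\cos (n-k)t\,dt\right| + \frac{1}{P_{n-1}}\left|\int_0^{1/n} \phi(t)\sum_{k=n}^{\infty} p_k\cos (n-k)t\,dt\right|$$

$$+ \frac{p_n}{P_nP_{n-1}}\left|\int_0^{1/n} \phi(t)\sum_{k=0}^{n-1} P_k\cos (n-k)t\,dt\right| + \frac{1}{P_{n-1}}\left|\int_{1/n}^{\pi} \phi(t)\left\{\sum_{k=n}^{\infty} p_k\cos (n-k)t + \frac{p_n}{P_n}\sum_{k=0}^{n-1} P_k\cos (n-k)t\right\}dt\right|$$

$$= I_1(n) + I_2(n) + I_3(n) + I_4(n).$$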






















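Now

$$I_1(n) = \frac{1}{P_{n-1}}\left|\int_0^{\pi} \phi(t)\sum_{k=0}^{\infty} p_k\cos (n-k)t\,dt\right| = \frac{1}{P_{n-1}}\left|\int_0^{\pi} \phi(t)\left\{\sum_{k=0}^{\infty} p_k\cos kt\cos nt + \sum_{k=0}^{\infty} p_k\sin kt\sin nt\right\}dt\right|$$

$$\le \frac{1}{P_{n-1}}\left|\int_0^{\pi} \phi(t)\alpha(t)\cos nt\,dt\right| + \frac{1}{P_{n-1}}\left|\int_0^{\pi} \phi(t)\beta(t)\sin nt\,dt\right| = \frac{|\alpha_n| + |\beta_n|}{P_{n-1}}.$$

$$I_2(n) = \frac{1}{P_{n-1}}\left|\int_0^{1/n} \phi(t)\sum_{k=n}^{\infty} p_k\cos (n-k)t\,dt\right| \le \frac{A}{P_{n-1}}\int_0^{1/n} |\phi(t)|\,P(t^{-1})\,dt, \text{ by Lemma (5.11),}$$

$$= A\,\frac{\Phi(n^{-1})}{P_{n-1}}, \text{ by (5.05).}$$

$$I_3(n) = \frac{p_n}{P_nP_{n-1}}\left|\int_0^{1/n} \phi(t)\sum_{k=0}^{n-1} P_k\cos (n-k)t\,dt\right| \le \frac{(P_0 + \cdots + P_n)p_n}{P_nP_{n-1}}\int_0^{1/n} |\phi(t)|\,dt$$

$$\le \frac{1}{P_{n-1}}\int_0^{1/n} |\phi(t)|\,P(t^{-1})\,dt, \text{ by Lemma (5.12),}$$

$$= \Phi(n^{-1})/P_{n-1}.$$

$$I_4(n) = \frac{1}{P_{n-1}}\left|\int_{1/n}^{\pi} \phi(t)\left\{\sum_{k=n}^{\infty} p_k\cos (n-k)t + \frac{p_n}{P_n}\sum_{k=0}^{n-1} P_k\cos (n-k)t\right\}dt\right|.$$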


Applying Abel’s transformation we obtain 


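$$\sum_{k=n}^{\infty} p_k\cos (n-k)t = p_n\cos 0 + p_{n+1}\cos t + p_{n+2}\cos 2t + \cdots$$

$$= \frac{p_n}{2} + (p_n - p_{n+1})\tfrac12 + (p_{n+1} - p_{n+2})(\tfrac12 + \cos t) + \cdots$$

$$= \frac{p_n}{2} + \sum_{k=n}^{\infty}(p_k - p_{k+1})\,\frac{\sin (n - k + \tfrac12)t}{2\sin \tfrac12 t},$$

$$\sum_{k=0}^{n-1} P_k\cos (n-k)t = P_0\cos nt + P_1\cos (n-1)t + \cdots + P_{n-1}\cos t$$

$$= (P_{n-1} - P_{n-2})(\tfrac12 + \cos t) + \cdots + (P_1 - P_0)(\tfrac12 + \cos t + \cdots + \cos (n-1)t) + P_0(\tfrac12 + \cos t + \cdots + \cos nt) - \tfrac12 P_{n-1}$$

$$= \sum_{k=0}^{n-1} p_k\,\frac{\sin (n - k + \tfrac12)t}{2\sin \tfrac12 t} - \tfrac12 P_{n-1}.$$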
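The second Abel transformation above, $\sum_{k=0}^{n-1} P_k\cos (n-k)t = \sum_{k=0}^{n-1} p_k\,\sin((n-k+\tfrac12)t)/(2\sin\tfrac12 t) - \tfrac12 P_{n-1}$, is a finite identity and can be checked directly. The sketch below evaluates both sides for hypothetical weights $p_k = 1/(k+1)$; the function names are illustrative.

```python
import math

def lhs(p, n, t):
    """Sum over k = 0..n-1 of P_k cos((n - k) t), with P_k = p_0 + ... + p_k."""
    big_p, running = [], 0.0
    for value in p[:n]:
        running += value
        big_p.append(running)
    return sum(big_p[k] * math.cos((n - k) * t) for k in range(n))

def rhs(p, n, t):
    """Sum of p_k sin((n - k + 1/2) t) / (2 sin(t/2)), minus P_{n-1} / 2."""
    total = sum(p[k] * math.sin((n - k + 0.5) * t) for k in range(n))
    total /= 2.0 * math.sin(0.5 * t)
    return total - 0.5 * sum(p[:n])  # sum(p[:n]) equals P_{n-1}

# Hypothetical weights (not taken from the paper).
p = [1.0 / (k + 1) for k in range(20)]
discrepancy = abs(lhs(p, 7, 0.9) - rhs(p, 7, 0.9))
```

The identity is purely algebraic, so it holds for any real weights, not only monotone ones.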










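Hence,

$$I_4(n) \le \frac{1}{P_{n-1}}\left|\int_{1/n}^{\pi} \frac{\phi(t)}{2\sin \tfrac12 t}\sum_{k=n}^{\infty}(p_k - p_{k+1})\sin (n - k + \tfrac12)t\,dt\right|$$

$$+ \frac{p_n}{P_nP_{n-1}}\left|\int_{1/n}^{\pi} \frac{\phi(t)}{2\sin \tfrac12 t}\sum_{k=0}^{n-1} p_k\sin (n - k + \tfrac12)t\,dt\right| + \frac{1}{2}\,\frac{p_n}{P_{n-1}}\left(1 - \frac{P_{n-1}}{P_n}\right)\left|\int_{1/n}^{\pi} \phi(t)\,dt\right|$$

$$= I_{4,1}(n) + I_{4,2}(n) + I_{4,3}(n).$$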





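We observe that

$$\sum_{k=n}^{\infty}(p_k - p_{k+1})\sin (n - k + \tfrac12)t = \Im\left\{\sum_{k=n}^{\infty}(p_k - p_{k+1})\exp (i(n - k + \tfrac12)t)\right\},$$

and $n \ge t^{-1}$ in the interval $[1/n, \pi]$, so that, by Lemma (5.11), we have

$$I_{4,1}(n) \le A\,\frac{p_n - p_{n+1}}{P_{n-1}}\int_{1/n}^{\pi} \frac{|\phi(t)|}{\sin \tfrac12 t}\,t^{-1}\,dt = A\,\frac{n(p_n - p_{n+1})}{P_{n-1}P_n}\int_{1/n}^{\pi} \frac{|\phi(t)|}{\sin \tfrac12 t}\,t^{-1}\,\frac{P_n}{n}\,dt$$

$$\le A\,\frac{n(p_n - p_{n+1})}{P_{n-1}P_n}\int_{1/n}^{\pi} \frac{|\phi(t)|}{\sin \tfrac12 t}\,P(t^{-1})\,dt, \text{ by Lemma (5.15),}$$

$$\le A\,\frac{n(p_n - p_{n+1})}{P_nP_{n-1}}\int_{1/n}^{\pi} |\phi(t)|\,P(t^{-1})\,t^{-1}\,dt.$$

Applying Lemma (5.11) to $I_{4,2}(n)$, we get

$$I_{4,2}(n) \le \frac{Ap_n}{P_nP_{n-1}}\int_{1/n}^{\pi} |\phi(t)|\,P(t^{-1})\,t^{-1}\,dt.$$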
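On integrating by parts, we see that

$$\int_{1/n}^{\pi} |\phi(t)|\,P(t^{-1})\,t^{-1}\,dt = \left[\Phi(t)\,t^{-1}\right]_{1/n}^{\pi} + \int_{1/n}^{\pi} \Phi(t)\,t^{-2}\,dt$$

$$\le \Phi(\pi) + n\Phi(n^{-1}) + \sum_{k=1}^{n} \Phi(k^{-1})\int_{1/(k+1)}^{1/k} t^{-2}\,dt + A$$

$$\le A + n\Phi(n^{-1}) + \sum_{k=1}^{n} \Phi(k^{-1}).$$

Therefore,

$$\sum_{n=1}^{\infty} I_{4,1}(n) \le A\left\{\sum_{n=1}^{\infty} \frac{n(p_n - p_{n+1})}{P_nP_{n-1}} + \sum_{n=1}^{\infty} \frac{n^2(p_n - p_{n+1})}{P_nP_{n-1}}\,\Phi(n^{-1}) + \sum_{n=1}^{\infty} \frac{n(p_n - p_{n+1})}{P_nP_{n-1}}\sum_{k=1}^{n} \Phi(k^{-1})\right\}.$$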













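However,

$$\sum_{n=1}^{\infty} \frac{n(p_n - p_{n+1})}{P_nP_{n-1}} \le A, \text{ by Lemma (5.14) with } n = 1;$$

$$\frac{n^2(p_n - p_{n+1})}{P_nP_{n-1}} \le \frac{(n+1)^2(p_n - p_{n+1})}{P_nP_{n-1}} \le \frac{A}{P_{n-1}}, \text{ by Lemma (5.13);}$$

$$\sum_{n=1}^{\infty} \frac{n(p_n - p_{n+1})}{P_nP_{n-1}}\sum_{k=1}^{n} \Phi(k^{-1}) = \sum_{n=1}^{\infty} \Phi(n^{-1})\sum_{k=n}^{\infty} \frac{k(p_k - p_{k+1})}{P_kP_{k-1}} \le A\sum_{n=1}^{\infty} \frac{\Phi(n^{-1})}{P_{n-1}}, \text{ by Lemma (5.14).}$$

Therefore,

$$\sum_{n=1}^{\infty} I_{4,1}(n) \le A\left\{1 + \sum_{n=1}^{\infty} \frac{\Phi(n^{-1})}{P_{n-1}}\right\}.$$

Also,

$$\sum_{n=1}^{\infty} I_{4,2}(n) \le A\left\{\sum_{n=1}^{\infty} \frac{p_n}{P_nP_{n-1}} + \sum_{n=1}^{\infty} \frac{np_n}{P_nP_{n-1}}\,\Phi(n^{-1}) + \sum_{n=1}^{\infty} \frac{p_n}{P_nP_{n-1}}\sum_{k=1}^{n} \Phi(k^{-1})\right\}.$$

However,

$$\sum_{n=1}^{\infty} \frac{p_n}{P_nP_{n-1}} = \sum_{n=1}^{\infty}\left(\frac{1}{P_{n-1}} - \frac{1}{P_n}\right) \le \frac{1}{P_0};$$

$$\frac{np_n}{P_nP_{n-1}} \le \frac{1}{P_{n-1}},$$

since $np_n \le (n+1)p_n \le P_n$; and

$$\sum_{n=1}^{\infty} \frac{p_n}{P_nP_{n-1}}\sum_{k=1}^{n} \Phi(k^{-1}) = \sum_{n=1}^{\infty} \Phi(n^{-1})\sum_{k=n}^{\infty} \frac{p_k}{P_kP_{k-1}} \le \sum_{n=1}^{\infty} \frac{\Phi(n^{-1})}{P_{n-1}}.$$

Therefore,

$$\sum_{n=1}^{\infty} I_{4,2}(n) \le A\left\{1 + \sum_{n=1}^{\infty} \frac{\Phi(n^{-1})}{P_{n-1}}\right\}.$$

Finally,

$$I_{4,3}(n) = \frac{p_n^2}{2P_nP_{n-1}}\left|\int_{1/n}^{\pi} \phi(t)\,dt\right|.$$

Now $\phi(t)$ is integrable, $(n+1)p_n \le P_n$, and, due to regularity, $(n+1)p_n \le AP_{n-1}$. Therefore,

$$I_{4,3}(n) \le \frac{A}{n^2}, \qquad \sum_{n=1}^{\infty} I_{4,3}(n) \le A.$$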







Hence (5.17) is established and the lemma is proved. 


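(5.18) LEMMA. Consider the Nörlund transformation $N_p$ with $p_n$ non-negative and non-increasing, $\lim p_n = 0$ and $|\Delta p_n|$ non-increasing. Suppose also that $\psi(t)P(t^{-1})$ belongs to class $L^q$, $q \ge 1$. Then

(5.19)
$$\sum_{n=1}^{\infty} |\bar t_n - \bar t_{n-1}| \le A\left\{1 + \sum_{n=1}^{\infty} \frac{\Psi(n^{-1})}{P_{n-1}} + \sum_{n=1}^{\infty} \frac{|\bar\alpha_n| + |\bar\beta_n|}{P_{n-1}}\right\}.$$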
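Proof.

$$\pi(\bar t_n - \bar t_{n-1}) = -\int_0^{\pi} \psi(t)\sum_{k=0}^{n}\left(\frac{p_{n-k}}{P_n} - \frac{p_{n-k-1}}{P_{n-1}}\right)\bar D_k(t)\,dt.$$

Going through the same steps as in Lemma (5.16) with $\psi(t)$ in place of $\phi(t)$ and $\bar D_n(t)$ in place of $D_n(t)$, it follows immediately that

$$\pi|\bar t_n - \bar t_{n-1}| \le \frac{1}{P_{n-1}}\left|\int_0^{\pi} \psi(t)\sum_{k=0}^{\infty} p_k\sin (n-k)t\,dt\right| + \frac{1}{P_{n-1}}\left|\int_0^{1/n} \psi(t)\sum_{k=n}^{\infty} p_k\sin (n-k)t\,dt\right|$$

$$+ \frac{p_n}{P_nP_{n-1}}\left|\int_0^{1/n} \psi(t)\sum_{k=0}^{n-1} P_k\sin (n-k)t\,dt\right| + \frac{1}{P_{n-1}}\left|\int_{1/n}^{\pi} \psi(t)\left\{\sum_{k=n}^{\infty} p_k\sin (n-k)t + \frac{p_n}{P_n}\sum_{k=0}^{n-1} P_k\sin (n-k)t\right\}dt\right|$$

$$= I_1(n) + I_2(n) + I_3(n) + I_4(n).$$

$$I_1(n) = \frac{1}{P_{n-1}}\left|\int_0^{\pi} \psi(t)\left\{\sum_{k=0}^{\infty} p_k\cos kt\sin nt - \sum_{k=0}^{\infty} p_k\sin kt\cos nt\right\}dt\right| \le \frac{|\bar\alpha_n| + |\bar\beta_n|}{P_{n-1}}.$$

$$I_2(n) \le \frac{A}{P_{n-1}}\int_0^{1/n} |\psi(t)|\,P(t^{-1})\,dt, \text{ by Lemmas (5.10) and (5.11),} \quad = A\,\frac{\Psi(n^{-1})}{P_{n-1}}.$$

$$I_3(n) \le \frac{(P_0 + \cdots + P_n)p_n}{P_nP_{n-1}}\int_0^{1/n} |\psi(t)|\,dt \le \frac{1}{P_{n-1}}\int_0^{1/n} |\psi(t)|\,P(t^{-1})\,dt, \text{ by Lemma (5.12).}$$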











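Applying the Abel transformation to $I_4(n)$, we see that

$$\sum_{k=n}^{\infty} p_k\sin (n-k)t = -\{p_{n+1}\sin t + p_{n+2}\sin 2t + \cdots\}$$

$$= -\{(p_{n+1} - p_{n+2})\sin t + (p_{n+2} - p_{n+3})(\sin t + \sin 2t) + \cdots\}$$

$$= -\sum_{k=n}^{\infty}(p_k - p_{k+1})\,\frac{\cos \tfrac12 t - \cos (n - k - \tfrac12)t}{2\sin \tfrac12 t},$$

$$\sum_{k=0}^{n-1} P_k\sin (n-k)t = P_0\sin nt + \cdots + P_{n-1}\sin t$$

$$= (P_{n-1} - P_{n-2})\sin t + (P_{n-2} - P_{n-3})(\sin t + \sin 2t) + \cdots + (P_1 - P_0)(\sin t + \cdots + \sin (n-1)t) + P_0(\sin t + \cdots + \sin nt)$$

$$= \sum_{k=0}^{n-1} p_k\,\frac{\cos \tfrac12 t - \cos (n - k + \tfrac12)t}{2\sin \tfrac12 t}.$$

Hence,

$$\sum_{k=n}^{\infty} p_k\sin (n-k)t + \frac{p_n}{P_n}\sum_{k=0}^{n-1} P_k\sin (n-k)t$$

$$= \sum_{k=n}^{\infty}(p_k - p_{k+1})\,\frac{\cos (n - k - \tfrac12)t}{2\sin \tfrac12 t} - \frac{p_n}{P_n}\sum_{k=0}^{n-1} p_k\,\frac{\cos (n - k + \tfrac12)t}{2\sin \tfrac12 t} - \frac{\cos \tfrac12 t}{2\sin \tfrac12 t}\,p_n\left(1 - \frac{P_{n-1}}{P_n}\right).$$

Therefore,

$$I_4(n) \le \frac{1}{P_{n-1}}\left|\int_{1/n}^{\pi} \frac{\psi(t)}{2\sin \tfrac12 t}\sum_{k=n}^{\infty}(p_k - p_{k+1})\cos (n - k - \tfrac12)t\,dt\right|$$

$$+ \frac{p_n}{P_nP_{n-1}}\left|\int_{1/n}^{\pi} \frac{\psi(t)}{2\sin \tfrac12 t}\sum_{k=0}^{n-1} p_k\cos (n - k + \tfrac12)t\,dt\right| + \frac{p_n^2}{P_nP_{n-1}}\int_{1/n}^{\pi} \frac{|\psi(t)|}{2\tan \tfrac12 t}\,dt$$

$$= I_{4,1}(n) + I_{4,2}(n) + I_{4,3}(n).$$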


Using the same method as in Lemma (5.16), we get

$$\sum_{n=1}^{\infty}\{I_{4,1}(n) + I_{4,2}(n)\} \le A\left\{1 + \sum_{n=1}^{\infty} \frac{\Psi(n^{-1})}{P_{n-1}}\right\}.$$











Now

$$I_{4,3}(n) = \frac{p_n^2}{P_nP_{n-1}}\int_{1/n}^{\pi} \frac{|\psi(t)|}{2\tan \tfrac12 t}\,dt \le A\,\frac{p_n}{P_nP_{n-1}}\int_{1/n}^{\pi} |\psi(t)|\,P(t^{-1})\,t^{-1}\,dt,$$

since $p_n \le p_0 = P_0 \le P(t^{-1})$, and, by the same argument as in Lemma (5.16),

$$\sum_{n=1}^{\infty} I_{4,3}(n) \le A\left\{1 + \sum_{n=1}^{\infty} \frac{\Psi(n^{-1})}{P_{n-1}}\right\};$$

hence, (5.19) is established.

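(5.20) LEMMA. If $p_n$ is non-negative and non-increasing, and if we write $\gamma(t) = \sum_{k=0}^{\infty} p_k e^{ikt}$, then, for $t$ in $[h, \pi]$,

(5.21) $|\gamma(t + 2h) - \gamma(t)| \le A\,h\,t^{-1}P(h^{-1}).$

Proof. Let $\tau = [t^{-1}]$ and $\theta = [h^{-1}]$. It is clear that $\tau \le \theta$. Now

$$\gamma(t + 2h) - \gamma(t) = \sum_{k=0}^{\infty} p_k\{\exp (ik(t + 2h)) - \exp (ikt)\}$$

$$= \left(\sum_{k=0}^{\tau} + \sum_{k=\tau+1}^{\theta} + \sum_{k=\theta+1}^{\infty}\right)p_k\{\exp (ik(t + 2h)) - \exp (ikt)\} = S_1 + S_2 + S_3,$$

from which the term $S_2$ may be absent.

$$|S_1| = \left|\sum_{k=0}^{\tau} p_k\exp (ik(t + h))\,2i\sin kh\right| \le 2\sum_{k=0}^{\tau} p_k\,kh, \text{ since } kh \le ht^{-1} \le 1 \text{ for } 0 \le k \le \tau,$$

$$\le 2ht^{-1}\sum_{k=0}^{\tau} p_k \le A\,h\,t^{-1}P(t^{-1}) \le A\,h\,t^{-1}P(h^{-1}).$$

By (3.02), we have

$$|S_3| \le A\,p(h^{-1})\max_{\theta+1 \le n \le \infty}\left|\sum_{k=0}^{n}(\exp (ik(t + 2h)) - \exp (ikt))\right|$$

$$\le A\,p(h^{-1})\max_{\theta+1 \le n \le \infty}\left|\frac{1 - \exp (i(n+1)(t + 2h))}{1 - \exp (i(t + 2h))} - \frac{1 - \exp (i(n+1)t)}{1 - \exp (it)}\right|$$

$$\le A\,p(h^{-1})(\sin \tfrac12 t)^{-1} \le A\,t^{-1}p(h^{-1}) \le A\,h\,t^{-1}P(h^{-1}), \text{ by Lemma (5.10).}$$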


lA 
















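Applying Abel's transformation to $S_2$, we get

$$S_2 = \sum_{k=\tau+1}^{\theta} p_k(\exp (ik(t + 2h)) - \exp (ikt))$$

$$= \sum_{k=\tau+1}^{\theta-1}(p_k - p_{k+1})\left\{\frac{1 - \exp (i(k+1)(t + 2h))}{1 - \exp (i(t + 2h))} - \frac{1 - \exp (i(k+1)t)}{1 - \exp (it)}\right\}$$

$$- p_{\tau+1}\left\{\frac{1 - \exp (i(\tau+1)(t + 2h))}{1 - \exp (i(t + 2h))} - \frac{1 - \exp (i(\tau+1)t)}{1 - \exp (it)}\right\} + p_\theta\left\{\frac{1 - \exp (i(\theta+1)(t + 2h))}{1 - \exp (i(t + 2h))} - \frac{1 - \exp (i(\theta+1)t)}{1 - \exp (it)}\right\}$$

$$= S_{2,1} + S_{2,2} + S_{2,3}.$$

$$|S_{2,3}| \le A\,p(h^{-1})\left|\frac{1}{\sin \tfrac12 t}\right| \le A\,t^{-1}p(h^{-1}) \le A\,h\,t^{-1}P(h^{-1}).$$

(5.22)
$$S_{2,2} = p_{\tau+1}\left\{\frac{\exp (i(\tau+1)(t + 2h))}{1 - \exp (i(t + 2h))} - \frac{\exp (i(\tau+1)t)}{1 - \exp (it)} - \frac{(\exp (it))(\exp (i2h) - 1)}{(1 - \exp (i(t + 2h)))(1 - \exp (it))}\right\}.$$

Let us consider the last term in the braces.

(5.23)
$$\left|\frac{(\exp (it))(\exp (i2h) - 1)}{(1 - \exp (i(t + 2h)))(1 - \exp (it))}\right| = \left|\frac{2i(\exp (it))(\exp (ih))\sin h}{-2i(\exp (\tfrac12 i(t + 2h)))\sin (\tfrac12(t + 2h)) \cdot (-2i)(\exp (\tfrac12 it))\sin \tfrac12 t}\right|$$

$$= \frac12\,\frac{\sin h}{\sin (\tfrac12(t + 2h))\sin \tfrac12 t} \le A\,h\,t^{-2}.$$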


Next, we consider the remaining terms in the braces on the right of (5.22). 














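$$\frac{\exp (i(\tau+1)t)}{1 - \exp (it)} = \frac{\exp (i(\tau+1)t)}{-2i(\exp (\tfrac12 it))\sin \tfrac12 t} = \frac{i\exp (i(\tau + \tfrac12)t)}{2\sin \tfrac12 t} = \frac{i\{\cos (\tau + \tfrac12)t + i\sin (\tau + \tfrac12)t\}}{2\sin \tfrac12 t} = f_\tau(t) + ig_\tau(t),$$

where

$$f_\tau(t) = -\frac{\sin (\tau + \tfrac12)t}{2\sin \tfrac12 t}, \qquad g_\tau(t) = \frac{\cos (\tau + \tfrac12)t}{2\sin \tfrac12 t}.$$

Hence,

(5.24)
$$\frac{\exp (i(\tau+1)(t + 2h))}{1 - \exp (i(t + 2h))} - \frac{\exp (i(\tau+1)t)}{1 - \exp (it)} = f_\tau(t + 2h) - f_\tau(t) + i\{g_\tau(t + 2h) - g_\tau(t)\}.$$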









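By the mean value theorem,

$$f_\tau(t + 2h) - f_\tau(t) = 2hf_\tau'(t + 2\theta_1h), \qquad 0 < \theta_1 < 1,$$

$$= 2h\,\frac{\tfrac12\sin ((\tau + \tfrac12)(t + 2\theta_1h))\cos \left(\dfrac{t + 2\theta_1h}{2}\right) - \sin \left(\dfrac{t + 2\theta_1h}{2}\right)(\tau + \tfrac12)\cos ((\tau + \tfrac12)(t + 2\theta_1h))}{2\sin^2 \dfrac{t + 2\theta_1h}{2}}.$$

Therefore, $|f_\tau(t + 2h) - f_\tau(t)| \le A\,h\,t^{-2}$, and similarly $|g_\tau(t + 2h) - g_\tau(t)| \le A\,h\,t^{-2}$. Hence

$$|S_{2,2}| \le A\,h\,t^{-2}p(t^{-1}) \le A\,h\,t^{-1}P(t^{-1}) \le A\,h\,t^{-1}P(h^{-1}).$$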






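$$S_{2,1} = \sum_{k=\tau+1}^{\theta-1}(p_k - p_{k+1})\left\{\frac{1 - \exp (i(k+1)(t + 2h))}{1 - \exp (i(t + 2h))} - \frac{1 - \exp (i(k+1)t)}{1 - \exp (it)}\right\}$$

$$= -\sum_{k=\tau+1}^{\theta-1}(p_k - p_{k+1})\left\{\frac{\exp (i(k+1)(t + 2h))}{1 - \exp (i(t + 2h))} - \frac{\exp (i(k+1)t)}{1 - \exp (it)}\right\} + \sum_{k=\tau+1}^{\theta-1}(p_k - p_{k+1})\,\frac{(\exp (it))(\exp (i2h) - 1)}{(1 - \exp (i(t + 2h)))(1 - \exp (it))}.$$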















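We have already shown in (5.23) that

$$\left|\frac{(\exp (it))(\exp (i2h) - 1)}{(1 - \exp (i(t + 2h)))(1 - \exp (it))}\right| \le A\,h\,t^{-2},$$

and, since $\sum_{k=\tau+1}^{\theta-1}(p_k - p_{k+1}) = p_{\tau+1} - p_\theta \le p_{\tau+1} \le p(t^{-1})$, we have

$$\left|\sum_{k=\tau+1}^{\theta-1}(p_k - p_{k+1})\,\frac{(\exp (it))(\exp (i2h) - 1)}{(1 - \exp (i(t + 2h)))(1 - \exp (it))}\right| \le A\,h\,t^{-2}p(t^{-1}) \le A\,h\,t^{-1}P(h^{-1}).$$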







As in (5.24), we have

$$\frac{\exp (i(k+1)(t + 2h))}{1 - \exp (i(t + 2h))} - \frac{\exp (i(k+1)t)}{1 - \exp (it)} = f_k(t + 2h) - f_k(t) + i\{g_k(t + 2h) - g_k(t)\},$$

so that, by the mean value theorem, we have

$$|f_k(t + 2h) - f_k(t)| = |2hf_k'(t + 2\theta_1h)|, \qquad 0 < \theta_1 < 1,$$










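$$= 2h\left|\frac{\tfrac12\sin ((k + \tfrac12)(t + 2\theta_1h))\cos \left(\dfrac{t + 2\theta_1h}{2}\right) - \sin \left(\dfrac{t + 2\theta_1h}{2}\right)(k + \tfrac12)\cos ((k + \tfrac12)(t + 2\theta_1h))}{2\sin^2 \dfrac{t + 2\theta_1h}{2}}\right|$$

$$\le A\,h\,t^{-2} + A\,h\,t^{-1}k,$$

and, in a similar manner, $|g_k(t + 2h) - g_k(t)| \le A\,h\,t^{-2} + A\,h\,t^{-1}k$. Therefore,

$$\left|\sum_{k=\tau+1}^{\theta-1}(p_k - p_{k+1})\left\{\frac{\exp (i(k+1)(t + 2h))}{1 - \exp (i(t + 2h))} - \frac{\exp (i(k+1)t)}{1 - \exp (it)}\right\}\right| \le A\sum_{k=\tau+1}^{\theta-1}(p_k - p_{k+1})(ht^{-2} + ht^{-1}k)$$

$$\le A\,h\,t^{-2}p(t^{-1}) + A\,h\,t^{-1}\sum_{k=\tau+1}^{\theta-1} k(p_k - p_{k+1})$$

$$\le A\,h\,t^{-1}P(t^{-1}) + A\,h\,t^{-1}\{(\tau+1)p_{\tau+1} + P_\theta\}$$

$$\le A\,h\,t^{-1}P(h^{-1}) + A\,h\,t^{-2}p(t^{-1}) + A\,h\,t^{-1}P(h^{-1}) \le A\,h\,t^{-1}P(h^{-1}).$$

Hence, $|S_{2,1}| \le A\,h\,t^{-1}P(h^{-1})$. Therefore,

$$|\gamma(t + 2h) - \gamma(t)| \le |S_1| + |S_3| + |S_{2,1}| + |S_{2,2}| + |S_{2,3}| \le A\,h\,t^{-1}P(h^{-1}).$$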


(5.25) DEFINITION. $f(x)$ is said to belong to the class Lip $\alpha$ for $a \le x \le b$ if $|f(x + h) - f(x)| \le A|h|^{\alpha}$ for $a \le x \le b$.


We will now prove the following theorem. 


(5.26) THEOREM. Let $N_p$ be a Nörlund transformation with $p_n$ non-negative and non-increasing, $\lim_{n\to\infty} p_n = 0$, $|\Delta p_n|$ non-increasing and satisfying the conditions (1) $\sum_{k=1}^{\infty} P_k^2k^{-2} \le A$ and (2) $\sum_{k=1}^{\infty} P_k^{-1}k^{-\alpha-\frac12} \le A$. If $f(x)$ belongs to class Lip $\alpha$, $0 < \alpha \le 1$, then $S(f)$ and $\bar S(f)$ are both summable $|N_p|$.
The following lemmas are pertinent to the proof of this theorem. 
(5.27) LEMMA. If $f(x)$ belongs to Lip $\alpha$, then $\phi(t)$ and $\psi(t)$ also belong to Lip $\alpha$.

Proof.

$$|\phi(t + h) - \phi(t)| = |f(x + t + h) + f(x - t - h) - f(x + t) - f(x - t)|$$

$$\le |f(x + t + h) - f(x + t)| + |f(x - t - h) - f(x - t)| \le A|h|^{\alpha}.$$

Similarly, $|\psi(t + h) - \psi(t)| \le A|h|^{\alpha}$.





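(5.28) LEMMA. If $f(x)$ belongs to Lip $\alpha$ and (5.26), (1) is satisfied, then $\phi(t)P(t^{-1})$ and $\psi(t)P(t^{-1})$ belong to $L$.

Proof. By Lemma (5.27), $\phi(t)$ and $\psi(t)$ belong to Lip $\alpha$; they are therefore uniformly continuous and hence bounded on $[0, \pi]$. Consequently

$$\int_0^{\pi} |\phi(t)|\,P(t^{-1})\,dt \le A\int_0^{\pi} P(t^{-1})\,dt = A\int_{1/\pi}^{\infty} t^{-2}P(t)\,dt$$

$$\le A\left\{\int_{1/\pi}^{1} t^{-2}P(t)\,dt + \int_1^{\infty} t^{-2}P(t)\,dt\right\} \le A + \sum_{k=1}^{\infty} P_kk^{-2} \le A + A\sum_{k=1}^{\infty} P_k^2k^{-2} \le A.$$

A similar result holds for $\psi(t)$.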


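(5.29) LEMMA. If $f(x)$ belongs to Lip $\alpha$, $\alpha > 0$, and conditions (5.26), (1) and (5.26), (2) are satisfied, then

$$\sum_{n=1}^{\infty} \Phi(n^{-1})P_{n-1}^{-1} \le A, \qquad \sum_{n=1}^{\infty} \Psi(n^{-1})P_{n-1}^{-1} \le A.$$

Proof. Clearly $|\phi(t)| \le At^{\alpha}$. Therefore

$$\Phi(n^{-1}) \le A\int_0^{1/n} t^{\alpha}P(t^{-1})\,dt = A\int_n^{\infty} t^{-\alpha-2}P(t)\,dt \le A\sum_{k=n}^{\infty} P_kk^{-\alpha-2}$$

$$\le A\left\{\sum_{k=n}^{\infty} P_k^2k^{-2}\right\}^{\frac12}\left\{\sum_{k=n}^{\infty} k^{-2\alpha-2}\right\}^{\frac12} \quad\text{(Schwarz's inequality)}$$

$$\le A\left\{\sum_{k=n}^{\infty} k^{-2\alpha-2}\right\}^{\frac12}, \text{ by (5.26), (1),} \quad \le An^{-\alpha-\frac12}.$$

Hence

$$\sum_{n=1}^{\infty} \Phi(n^{-1})P_{n-1}^{-1} \le A\sum_{n=1}^{\infty} P_{n-1}^{-1}n^{-\alpha-\frac12} \le A, \text{ by (5.26), (2).}$$

The second inequality follows in a similar manner.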










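(5.30) LEMMA. If $p_n$ is non-negative and non-increasing and $\sum_{k=1}^{\infty} P_k^2k^{-2} \le A$, then $P_n^2n^{-1} \le A$.

Proof. Let $y_k = P_k^2k^{-1}$. Then

$$y_{k+1} - y_k = P_{k+1}^2(k+1)^{-1} - P_k^2k^{-1} = k^{-1}(k+1)^{-1}\{kP_{k+1}^2 - (k+1)P_k^2\}$$

$$= k^{-1}(k+1)^{-1}\{2kp_{k+1}P_k + kp_{k+1}^2 - P_k^2\} \le AP_k^2k^{-2}.$$

Therefore,

$$y_n - y_1 = \sum_{k=1}^{n-1}(y_{k+1} - y_k) \le A\sum_{k=1}^{n} P_k^2k^{-2} \le A.$$

Hence $y_n \le A$.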


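(5.31) LEMMA. If $p_n$ is non-negative and non-increasing, then

$$\sum_{v=1}^{\infty} P_{2^v}^{-1}\,2^{-v(\alpha-\frac12)} \le A\sum_{n=1}^{\infty} P_n^{-1}n^{-(\alpha+\frac12)}.$$

Proof. First we observe that

(5.32)
$$P_{2^v}P_{2^{v-1}}^{-1} = (P_{2^{v-1}} + p_{2^{v-1}+1} + \cdots + p_{2^v})P_{2^{v-1}}^{-1} \le 1 + 2^{v-1}p_{2^{v-1}}P_{2^{v-1}}^{-1} \le A.$$

Therefore,

$$\sum_{v=1}^{\infty} P_{2^v}^{-1}\,2^{-v(\alpha-\frac12)} \le A\sum_{v=1}^{\infty} P_{2^v}^{-1}\sum_{k=2^{v-1}+1}^{2^v} k^{-(\alpha+\frac12)} \le A\sum_{v=1}^{\infty}\sum_{k=2^{v-1}+1}^{2^v} P_k^{-1}k^{-(\alpha+\frac12)} \le A\sum_{n=1}^{\infty} P_n^{-1}n^{-(\alpha+\frac12)}.$$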


Proof of (5.26). In view of our hypotheses and Lemma (5.28), it is clear that the conditions of Lemmas (5.16) and (5.18) are satisfied; therefore, the inequalities (5.17) and (5.19) are true. Hence, due to Lemma (5.29), it suffices to show that

$$\sum_{n=1}^{\infty} \frac{|\alpha_n| + |\beta_n|}{P_{n-1}} \le A, \qquad \sum_{n=1}^{\infty} \frac{|\bar\alpha_n| + |\bar\beta_n|}{P_{n-1}} \le A.$$


Now 





(5.33)
$$\int_0^{\pi} \alpha^2(t)\,dt \le A\int_0^{\pi} P^2(t^{-1})\,dt, \text{ by Lemma (5.11),} \quad = A\int_{1/\pi}^{\infty} P^2(t)\,t^{-2}\,dt \le A + A\sum_{k=1}^{\infty} P_k^2k^{-2} \le A, \text{ by (5.26), (1).}$$

Hence, since $\phi(t)$ is bounded, $\alpha_n$ is the Fourier coefficient of an even function belonging to $L^2$. Similarly, $\bar\beta_n$ is the Fourier coefficient of an even function belonging to $L^2$, and $\beta_n$ and $\bar\alpha_n$ are Fourier coefficients of odd functions belonging to $L^2$. The Fourier series of $\phi(t + h)\alpha(t + h) - \phi(t - h)\alpha(t - h)$ is

$$-\frac{4}{\pi}\sum_{n=1}^{\infty} \alpha_n\sin nt\sin nh.$$
T n=l 


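On applying Parseval's relation, we get

$$\sum_{n=1}^{\infty} \alpha_n^2\sin^2 nh \le A\int_0^{\pi}\{\phi(t + h)\alpha(t + h) - \phi(t - h)\alpha(t - h)\}^2\,dt$$

$$\le A\left\{\int_0^{\pi} \alpha^2(t + h)\{\phi(t + h) - \phi(t - h)\}^2\,dt + \int_0^{\pi} \phi^2(t - h)\{\alpha(t + h) - \alpha(t - h)\}^2\,dt\right\} = J_1(h) + J_2(h).$$

Taking $h$ to be positive, it is clear from (5.27) and (5.33) that $J_1(h) \le Ah^{2\alpha}$. Now

$$J_2(h) = A\int_0^{\pi} \phi^2(t - h)\{\alpha(t + h) - \alpha(t - h)\}^2\,dt = A\int_{-h}^{\pi-h} \phi^2(t)\{\alpha(t + 2h) - \alpha(t)\}^2\,dt$$

$$\le A\int_{-h}^{h} \phi^2(t)\alpha^2(t + 2h)\,dt + A\int_{-h}^{h} \phi^2(t)\alpha^2(t)\,dt + A\int_h^{\pi} \phi^2(t)\{\alpha(t + 2h) - \alpha(t)\}^2\,dt$$

$$= J_{2,1}(h) + J_{2,2}(h) + J_{2,3}(h).$$

Consider each of these integrals separately. First,

$$J_{2,1}(h) \le Ah^{2\alpha}\int_{-h}^{h} P^2\left(\frac{1}{t + 2h}\right)dt \le Ah^{2\alpha+1}P^2(h^{-1}) \le Ah^{2\alpha}, \text{ by Lemma (5.30).}$$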










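Next,

$$J_{2,2}(h) \le A\int_0^{h} t^{2\alpha}P^2(t^{-1})\,dt = A\int_{h^{-1}}^{\infty} t^{-2\alpha-2}P^2(t)\,dt \le A\sum_{k=[h^{-1}]}^{\infty} P_k^2k^{-2\alpha-2}$$

$$\le Ah^{2\alpha}\sum_{k=[h^{-1}]}^{\infty} P_k^2k^{-2} \le Ah^{2\alpha}, \text{ by (5.26), (1).}$$

Finally,

$$J_{2,3}(h) \le Ah^2P^2(h^{-1})\int_h^{\pi} t^{2\alpha-2}\,dt, \text{ by Lemma (5.20),}$$

$$\le \begin{cases} Ah^2P^2(h^{-1})\,\dfrac{\pi^{2\alpha-1} - h^{2\alpha-1}}{2\alpha - 1}, & \alpha \ne \tfrac12, \\ Ah^2P^2(h^{-1})(\log \pi - \log h), & \alpha = \tfrac12, \end{cases}$$

$$\le \begin{cases} Ah^{2\alpha+1}P^2(h^{-1}), & \alpha < \tfrac12, \\ Ah^2P^2(h^{-1}), & \alpha > \tfrac12, \\ Ah^2P^2(h^{-1})\log h^{-1}, & \alpha = \tfrac12, \end{cases} \quad \le \begin{cases} Ah^{2\alpha}, & \alpha < \tfrac12, \text{ by Lemma (5.30),} \\ Ah^2P^2(h^{-1}), & \alpha > \tfrac12, \\ Ah^2P^2(h^{-1})\log h^{-1}, & \alpha = \tfrac12. \end{cases}$$

Therefore,

$$\sum_{n=1}^{\infty} \alpha_n^2\sin^2 nh \le Ah^{2\alpha} + Ah^2P^2(h^{-1})\log h^{-1}.$$

Now let us write $h = \pi/2N$; then the above inequality becomes

$$\sum_{n=1}^{\infty} \alpha_n^2\sin^2 (n\pi/2N) \le A\{N^{-2\alpha} + N^{-2}P^2(2N/\pi)\log (2N/\pi)\} \le A\{N^{-2\alpha} + N^{-2}P^2(N)\log N\}.$$

Taking $N = 2^v$, we get

$$\sum_{n=2^{v-1}+1}^{2^v} \alpha_n^2 \le 2\sum_{n=2^{v-1}+1}^{2^v} \alpha_n^2\sin^2 (n\pi/2^{v+1}) \le 2\sum_{n=1}^{\infty} \alpha_n^2\sin^2 (n\pi/2^{v+1}) \le A\{2^{-2v\alpha} + v\,2^{-2v}P_{2^v}^2\}.$$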





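Therefore, applying Schwarz's inequality, we get

$$\sum_{n=2^{v-1}+1}^{2^v} |\alpha_n|\,P_{n-1}^{-1} \le \left\{\sum_{n=2^{v-1}+1}^{2^v} \alpha_n^2\right\}^{\frac12}\left\{\sum_{n=2^{v-1}+1}^{2^v} P_{n-1}^{-2}\right\}^{\frac12}$$

$$\le A\{2^{-v\alpha} + v^{\frac12}2^{-v}P_{2^v}\}\cdot 2^{\frac{v}{2}}P_{2^{v-1}}^{-1} = A\{2^{-v(\alpha-\frac12)}P_{2^{v-1}}^{-1} + v^{\frac12}2^{-\frac{v}{2}}P_{2^v}P_{2^{v-1}}^{-1}\}$$

$$\le A\{2^{-v(\alpha-\frac12)}P_{2^v}^{-1} + v^{\frac12}2^{-\frac{v}{2}}\}, \text{ by (5.32).}$$

Hence,

$$\sum_{n=1}^{\infty} |\alpha_n|\,P_{n-1}^{-1} \le A + \sum_{v=1}^{\infty}\sum_{n=2^{v-1}+1}^{2^v} |\alpha_n|\,P_{n-1}^{-1} \le A + A\sum_{v=1}^{\infty}\{P_{2^v}^{-1}2^{-v(\alpha-\frac12)} + v^{\frac12}2^{-\frac{v}{2}}\}$$

$$\le A + A\sum_{n=1}^{\infty} P_n^{-1}n^{-(\alpha+\frac12)}, \text{ by Lemma (5.31),} \quad \le A, \text{ by (5.26), (2).}$$

It can be shown in a similar manner that $\sum_{n=1}^{\infty} |\beta_n|\,P_{n-1}^{-1}$, $\sum_{n=1}^{\infty} |\bar\alpha_n|\,P_{n-1}^{-1}$ and $\sum_{n=1}^{\infty} |\bar\beta_n|\,P_{n-1}^{-1}$ are each bounded, and so our theorem is proved.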


(5.34) COROLLARY (Bernstein's theorem). If $f(x)$ belongs to Lip $\alpha$, $\tfrac12 < \alpha \le 1$, then $S(f)$ and $\bar S(f)$ converge absolutely.

Proof. Absolute convergence is summability $|C, 0|$, or summability $|N_p|$, where $p_0 = 1$, $p_n = 0$, $n \ge 1$. Clearly $p_n$ is non-negative and non-increasing, $\lim_{n\to\infty} p_n = 0$ and $|\Delta p_n|$ is non-increasing. Thus it suffices to show that (5.26), (1) and (2) are satisfied. Now

(1) $\sum_{k=1}^{\infty} P_k^2k^{-2} = \sum_{k=1}^{\infty} k^{-2} \le A,$

(2) $\sum_{k=1}^{\infty} P_k^{-1}k^{-\alpha-\frac12} = \sum_{k=1}^{\infty} k^{-\alpha-\frac12} \le A,$

since $\alpha + \tfrac12 > 1$.
(5.35) COROLLARY (Hyslop's theorem). If $f(x)$ belongs to Lip $\alpha$, $0 < \alpha \le \tfrac12$, and if $\beta + \alpha > \tfrac12$, $\beta > 0$, then $S(f)$ and $\bar S(f)$ are summable $|C, \beta|$.

Proof. By Theorem (2.20), $|C, \beta| \subset |C, \gamma|$ for $\beta < \gamma$. Hence we can assume $0 < \beta < \tfrac12$, and we only need to show that the conditions of Theorem (5.26) are satisfied in this case. It is clear that $p_n$ is non-negative and non-increasing, $\lim_{n\to\infty} p_n = 0$ and $|\Delta p_n|$ is non-increasing. Now

$$P_n = \frac{(\beta + 1)(\beta + 2)\cdots(\beta + n)}{n!} \sim n^{\beta}.$$

Therefore,

(1) $\sum_{k=1}^{\infty} P_k^2k^{-2} \le A\sum_{k=1}^{\infty} k^{2\beta-2} \le A$, since $2(\beta - 1) < -1$;

(2) $\sum_{k=1}^{\infty} P_k^{-1}k^{-\alpha-\frac12} \le A\sum_{k=1}^{\infty} k^{-\alpha-\beta-\frac12} \le A$, since $\alpha + \beta + \tfrac12 > 1$.
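The asymptotic relation $P_n \sim n^{\beta}$ for the Cesàro weights used in this proof can be observed numerically: the ratio $P_n/n^{\beta}$ tends to $1/\Gamma(\beta + 1)$, a standard fact about ratios of gamma functions. The following is a sketch only; the value $\beta = 0.3$ and the function names are illustrative choices, not taken from the paper.

```python
import math

def cesaro_partial_sum(beta, n):
    """P_n = (beta + 1)(beta + 2)...(beta + n) / n!, built as a running product."""
    value = 1.0
    for j in range(1, n + 1):
        value *= (beta + j) / j
    return value

def normalized(beta, n):
    """P_n / n**beta, which should approach 1 / Gamma(beta + 1) as n grows."""
    return cesaro_partial_sum(beta, n) / n ** beta

beta = 0.3  # hypothetical order with 0 < beta < 1/2
limit = 1.0 / math.gamma(beta + 1.0)
error_at_20000 = abs(normalized(beta, 20000) - limit)
```

With $P_k$ of exact order $k^{\beta}$, conditions (1) and (2) reduce to the convergence of $\sum k^{2\beta-2}$ and $\sum k^{-\alpha-\beta-\frac12}$, which is what the two stated inequalities on the exponents guarantee.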
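(5.36) COROLLARY. Let the Nörlund transformation $N_p$ be defined by

$$p_n = \frac{(n + c)^{\beta-1}\log^{1+\epsilon} (n + c)}{c^{\beta-1}\log^{1+\epsilon} c},$$

where $0 < \beta < \tfrac12$, $\epsilon > 0$ and $\log c \ge \tfrac83(1 + \epsilon)$. Let $f(x)$ belong to class Lip $\alpha$, where $\alpha + \beta = \tfrac12$. Then $S(f)$ and $\bar S(f)$ are both summable $|N_p|$.

Proof. Obviously $p_n \ge 0$ and $p_0 = 1$. We will write

$$p(x) = \frac{(x + c)^{\beta-1}\log^{1+\epsilon} (x + c)}{c^{\beta-1}\log^{1+\epsilon} c}.$$

Then

$$p'(x) = \frac{(x + c)^{\beta-2}\log^{\epsilon} (x + c)}{c^{\beta-1}\log^{1+\epsilon} c}\{1 + \epsilon + (\beta - 1)\log (x + c)\} \le 0$$

for all $x \ge 0$ when $\log c \ge 2(1 + \epsilon)$. Therefore, $p(x)$, and consequently $p_n = p(n)$, is non-increasing. Also,

$$p''(x) = \frac{(x + c)^{\beta-3}\log^{\epsilon-1} (x + c)}{c^{\beta-1}\log^{1+\epsilon} c}\{\epsilon(\epsilon + 1) + (1 + \epsilon)(2\beta - 3)\log (x + c) + (\beta - 1)(\beta - 2)\log^2 (x + c)\} \ge 0$$

for all $x \ge 0$ when $\log c \ge \tfrac83(1 + \epsilon)$. Therefore $|p'(x)|$, and hence also $|\Delta p_n|$, is non-increasing. Finally, $P_n \sim n^{\beta}\log^{1+\epsilon} n$. Therefore,

(1) $\sum_{k=1}^{\infty} P_k^2k^{-2} \le A\sum_{k=1}^{\infty} k^{2\beta-2}\log^{2+2\epsilon} k \le A$, since $1 - \beta > \tfrac12$.

(2) $\sum_{k=1}^{\infty} P_k^{-1}k^{-\alpha-\frac12} \le A\sum_{k=1}^{\infty} (\log^{1+\epsilon} k)^{-1}k^{-\alpha-\beta-\frac12} = A\sum_{k=1}^{\infty} (k\log^{1+\epsilon} k)^{-1} \le A.$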





(5.37) COROLLARY. Let the Nörlund transformation $N_p$ be defined by

$$p_n = \frac{c\log^{\epsilon} (n + c)}{(n + c)\log^{\epsilon} c},$$

where $\log c \ge \tfrac32\epsilon > 0$. Let $f(x)$ belong to class Lip $\tfrac12$. Then $S(f)$ and $\bar S(f)$ are summable $|N_p|$.

Proof. Clearly $p_n \ge 0$ and $p_0 = 1$. If we write

$$p(x) = \frac{c\log^{\epsilon} (x + c)}{(x + c)\log^{\epsilon} c},$$

then

$$p'(x) = \frac{c\log^{\epsilon} (x + c)}{(x + c)^2\log^{\epsilon} c}\left\{\frac{\epsilon}{\log (x + c)} - 1\right\} \le 0$$

for $x \ge 0$ and $\log c \ge \epsilon$. Therefore, $p_n$ is non-increasing. Also
€ ( = 2 
g's) = Ete! o—) . _® ae" 2} 
(x +c) logtc \log? (a +c) log (a+ +c) ~ Jog Ge +c) 
> ¢ log* (x + c) { ot 3e ae 
™ (x + c)* logte | log (x + c)) 
20 


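for $x \ge 0$ and $\log c \ge \tfrac32\epsilon$. Therefore, $|\Delta p_n|$ is non-increasing. Finally, $P_n \sim \log^{1+\epsilon} n$. Therefore,

$$\sum_{k=1}^{\infty} P_k^2k^{-2} \le A\sum_{k=1}^{\infty} \frac{\log^{2+2\epsilon} k}{k^2} \le A,$$

$$\sum_{k=1}^{\infty} P_k^{-1}k^{-1} \le A\sum_{k=1}^{\infty} (k\log^{1+\epsilon} k)^{-1} \le A.$$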
k=1 k=1 


(5.38) DEFINITION. $f(x)$ is said to belong to class Lip $(\alpha, q)$ for $a \le x \le b$ if

$$\left\{\int_a^b |f(x + h) - f(x)|^q\,dx\right\}^{1/q} \le A|h|^{\alpha}.$$


(5.39) THEOREM. Let $N_p$ be a Nörlund transformation with $p_n$ non-negative and non-increasing, $\lim_{n\to\infty} p_n = 0$, $|\Delta p_n|$ non-increasing, and satisfying the conditions

(1) $\sum_{k=1}^{\infty} P_k^{2q/(q-2)}k^{-2} \le A, \qquad 2 < q < \infty,$

(2) $\sum_{k=1}^{\infty} P_k^{-1}k^{-\alpha-\frac12} \le A.$

If $f(x)$ belongs to class Lip $(\alpha, q)$ on $[0, \pi]$, $0 < \alpha \le 1$, $\alpha q > 1$, then $S(f)$ and $\bar S(f)$ are both summable $|N_p|$.













The following lemmas are pertinent to the proof of this theorem. 


(5.40) LEMMA. If $f(x)$ belongs to Lip $(\alpha, q)$ on $[0, \pi]$, then $\phi(t)$ and $\psi(t)$ belong to Lip $(\alpha, q)$ on $[0, \pi]$.

Proof. Clearly

$$|\phi(t + h) - \phi(t)| \le |f(x + t + h) - f(x + t)| + |f(x - t - h) - f(x - t)|.$$

Hence, by Minkowski's inequality,

$$\left\{\int_0^{\pi} |\phi(t + h) - \phi(t)|^q\,dt\right\}^{1/q} \le \left\{\int_0^{\pi} |f(x + t + h) - f(x + t)|^q\,dt\right\}^{1/q} + \left\{\int_0^{\pi} |f(x - t - h) - f(x - t)|^q\,dt\right\}^{1/q} \le A|h|^{\alpha}.$$

A similar proof holds for $\psi(t)$.

(5.41) LEMMA. If $f(x)$ belongs to Lip $(\alpha, q)$, $q \ge 1$, $0 < \alpha \le 1$, $\alpha q > 1$, then $f(x)$ is equivalent to a function of class Lip $(\alpha - 1/q)$.

Proof. This result was obtained by Hardy and Littlewood [4].

(5.42) LEMMA. If $f(x)$ belongs to Lip $(\alpha, q)$, $q \ge 1$, $0 < \alpha \le 1$, $\alpha q > 1$, and condition (5.39), (1) is satisfied, then $\phi(t)P(t^{-1})$ and $\psi(t)P(t^{-1})$ belong to class $L$.

Proof. Due to Lemmas (5.40) and (5.41), $\phi(t)$ and $\psi(t)$ belong to class Lip $(\alpha - 1/q)$, and are therefore uniformly continuous and hence bounded on $[0, \pi]$. Therefore

$$\int_0^{\pi} |\phi(t)|\,P(t^{-1})\,dt \le A\int_0^{\pi} P(t^{-1})\,dt = A\int_{1/\pi}^{\infty} t^{-2}P(t)\,dt \le A + A\sum_{k=1}^{\infty} P_kk^{-2} \le A + A\sum_{k=1}^{\infty} P_k^{2q/(q-2)}k^{-2} \le A.$$

The result for $\psi(t)P(t^{-1})$ follows in the same manner.
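(5.43) LEMMA. If $f(x)$ belongs to Lip $(\alpha, q)$, $q \ge 1$, $0 < \alpha \le 1$, $\alpha q > 1$, and conditions (5.39), (1) and (5.39), (2) are satisfied, then

$$\sum_{n=1}^{\infty} \Phi(n^{-1})P_{n-1}^{-1} \le A, \qquad \sum_{n=1}^{\infty} \Psi(n^{-1})P_{n-1}^{-1} \le A.$$

Proof. By Lemma (5.40), $\phi(t)$ belongs to class Lip $(\alpha, q)$, and, by Lemma (5.41), $\phi(t)$ is equivalent to a function of class Lip $(\alpha - 1/q)$. Hence,

$$\Phi(n^{-1}) \le A\int_0^{1/n} t^{\alpha-1/q}P(t^{-1})\,dt = A\int_n^{\infty} t^{-\alpha+1/q-2}P(t)\,dt$$

$$\le A\left\{\sum_{k=n}^{\infty} P_k^{2q/(q-2)}k^{-2}\right\}^{(q-2)/2q}\left\{\sum_{k=n}^{\infty} k^{-(1+\alpha+1/q)(2q/(q+2))}\right\}^{(q+2)/2q},$$

by Hölder's inequality. However,

$$\left\{\sum_{k=n}^{\infty} k^{-(1+\alpha+1/q)(2q/(q+2))}\right\}^{(q+2)/2q} \le A\left\{n^{-(2q\alpha+2q+2-q-2)/(q+2)}\right\}^{(q+2)/2q} \le An^{-\alpha-\frac12}.$$

Therefore, $\Phi(n^{-1}) \le An^{-\alpha-\frac12}$ and consequently

$$\sum_{n=1}^{\infty} \Phi(n^{-1})P_{n-1}^{-1} \le A\sum_{n=1}^{\infty} P_{n-1}^{-1}n^{-\alpha-\frac12} \le A.$$

The second inequality follows in a similar manner.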


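(5.44) LEMMA. If $\sum_{n=1}^{\infty} P_n^{2q/(q-2)}/n^2 \le A$ for $q > 2$, then $P_n^2/n^{1-2/q} \le A$.

Proof. First, we observe that for a given $r > 0$

$$\frac{P_n^r}{n} - \frac{P_{n+1}^r}{n+1} = \frac{(n+1)P_n^r - nP_{n+1}^r}{n(n+1)} = \frac{P_n^r}{n(n+1)} + \frac{n(P_n^r - P_{n+1}^r)}{n(n+1)},$$

$$P_n^r - P_{n+1}^r = P_n^r - (P_n + p_{n+1})^r = P_n^r - P_n^r\left(1 + \frac{p_{n+1}}{P_n}\right)^r$$

$$= P_n^r - P_n^r\left(1 + r\frac{p_{n+1}}{P_n} + \frac{r(r-1)}{2!}\frac{p_{n+1}^2}{P_n^2} + \cdots\right) = -P_n^r\left(r\frac{p_{n+1}}{P_n} + \frac{r(r-1)}{2!}\frac{p_{n+1}^2}{P_n^2} + \cdots\right).$$

However, $p_{n+1}/P_n \le p_n/P_n \le 1/(n+1) < 1/n$. Hence,

$$\left|r\frac{p_{n+1}}{P_n} + \frac{r(r-1)}{2!}\frac{p_{n+1}^2}{P_n^2} + \cdots\right| \le \frac{1}{n}\left(r + \frac{|r(r-1)|}{2!}\frac{1}{n} + \cdots + \frac{|r(r-1)\cdots(r-k+1)|}{k!}\frac{1}{n^{k-1}} + \cdots\right) \le \frac{A}{n}.$$

Therefore,

$$|P_n^r - P_{n+1}^r| = P_n^r\left|r\frac{p_{n+1}}{P_n} + \frac{r(r-1)}{2!}\frac{p_{n+1}^2}{P_n^2} + \cdots\right| \le \frac{A}{n}P_n^r,$$

and

$$\left|\frac{P_n^r}{n} - \frac{P_{n+1}^r}{n+1}\right| \le \frac{P_n^r}{n(n+1)} + \frac{AP_n^r}{n(n+1)} \le A\frac{P_n^r}{n^2}.$$

Consequently,

$$\frac{P_n^r}{n} = P_1^r + \left(\frac{P_2^r}{2} - \frac{P_1^r}{1}\right) + \cdots + \left(\frac{P_n^r}{n} - \frac{P_{n-1}^r}{n-1}\right) \le P_1^r + \left|\frac{P_2^r}{2} - \frac{P_1^r}{1}\right| + \cdots + \left|\frac{P_n^r}{n} - \frac{P_{n-1}^r}{n-1}\right|$$

$$\le P_1^r + \frac{AP_1^r}{1^2} + \cdots + \frac{AP_{n-1}^r}{(n-1)^2} \le A\sum_{k=1}^{n-1} \frac{P_k^r}{k^2}.$$

On taking $r = 2q/(q-2)$, this becomes

$$\frac{P_n^{2q/(q-2)}}{n} \le A\sum_{k=1}^{n} \frac{P_k^{2q/(q-2)}}{k^2} \le A.$$

Therefore,

$$\frac{P_n^2}{n^{1-2/q}} = \left(\frac{P_n^{2q/(q-2)}}{n}\right)^{(q-2)/q} \le A.$$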
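(5.45) LEMMA. If $\sum_{n=1}^{\infty} P_n^{2q/(q-2)}/n^2 \le A$, $q > 2$, then $\int_0^{1/n} P^2(t^{-1})\,dt \le An^{-2/q}$.

Proof. On making the substitution $t = 1/u$ we get

$$\int_0^{1/n} P^2(t^{-1})\,dt = \int_n^{\infty} \frac{P^2(u)}{u^2}\,du \le A\sum_{k=n}^{\infty} \frac{P_k^2}{k^2} = A\sum_{k=n}^{\infty} \frac{P_k^2}{k^{2(q-2)/q}}\cdot\frac{1}{k^{4/q}}$$

$$\le A\left\{\sum_{k=n}^{\infty} \frac{P_k^{2q/(q-2)}}{k^2}\right\}^{(q-2)/q}\left\{\sum_{k=n}^{\infty} \frac{1}{k^2}\right\}^{2/q} \quad\text{(Hölder's inequality)}$$

$$\le An^{-2/q}.$$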


Proof of (5.39). From our hypotheses and Lemma (5.42), it is clear that the conditions of Lemmas (5.16) and (5.18) are satisfied. Therefore, the inequalities (5.17) and (5.19) are true. Hence, in view of Lemma (5.43), it suffices to show that

$$\sum_{n=1}^{\infty} \frac{|\alpha_n| + |\beta_n|}{P_{n-1}} \le A, \qquad \sum_{n=1}^{\infty} \frac{|\bar\alpha_n| + |\bar\beta_n|}{P_{n-1}} \le A.$$

As in Theorem (5.26), we obtain

$$\sum_{n=1}^{\infty} \alpha_n^2\sin^2 nh \le A\left\{\int_0^{\pi} \alpha^2(t + h)\,|\phi(t + h) - \phi(t - h)|^2\,dt + \int_0^{\pi} \phi^2(t - h)\,|\alpha(t + h) - \alpha(t - h)|^2\,dt\right\} = J_1(h) + J_2(h).$$


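Now, taking $h$ to be positive, we get

$$J_1(h) = A\int_0^{\pi} \alpha^2(t + h)\,|\phi(t + h) - \phi(t - h)|^2\,dt$$

$$\le A\left\{\int_0^{\pi} |\phi(t + h) - \phi(t - h)|^q\,dt\right\}^{2/q}\left\{\int_0^{\pi} |\alpha(t + h)|^{2q/(q-2)}\,dt\right\}^{1-2/q}$$

$$\le Ah^{2\alpha}\left\{\int_0^{\pi} |\alpha(t + h)|^{2q/(q-2)}\,dt\right\}^{1-2/q} \le Ah^{2\alpha}\left\{\int_0^{\pi} P^{2q/(q-2)}((t + h)^{-1})\,dt\right\}^{1-2/q}$$

$$= Ah^{2\alpha}\left\{\int_{(\pi+h)^{-1}}^{h^{-1}} P^{2q/(q-2)}(t)\,t^{-2}\,dt\right\}^{1-2/q} \le Ah^{2\alpha}\left\{A + \sum_{k=1}^{[h^{-1}]+1} P_k^{2q/(q-2)}k^{-2}\right\}^{1-2/q} \le Ah^{2\alpha}.$$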
We will write 
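$$J_2(h) = A\int_0^{\pi} \phi^2(t - h)\,|\alpha(t + h) - \alpha(t - h)|^2\,dt$$

$$\le A\left\{\int_{-h}^{h} \phi^2(t)\alpha^2(t + 2h)\,dt + \int_{-h}^{h} \phi^2(t)\alpha^2(t)\,dt + \int_h^{\pi} \phi^2(t)\,|\alpha(t + 2h) - \alpha(t)|^2\,dt\right\}$$

$$= J_{2,1}(h) + J_{2,2}(h) + J_{2,3}(h).$$

Then we observe first that

$$J_{2,1}(h) \le Ah^{2\alpha-2/q}\int_{-h}^{h} \alpha^2(t + 2h)\,dt, \text{ by Lemma (5.41),} \quad \le Ah^{2\alpha-2/q}\,hP^2(h^{-1}) \le Ah^{2\alpha}, \text{ by Lemma (5.44).}$$

Next,

$$J_{2,2}(h) \le Ah^{2\alpha-2/q}\int_0^{h} P^2(t^{-1})\,dt \le Ah^{2\alpha-2/q+2/q}, \text{ by Lemma (5.45),} \quad = Ah^{2\alpha}.$$

Finally,

$$J_{2,3}(h) \le Ah^2P^2(h^{-1})\int_h^{\pi} t^{-2}\phi^2(t)\,dt, \text{ by Lemma (5.20).}$$

Now,

$$\int_h^{\pi} t^{-2}\phi^2(t)\,dt \le A\int_h^{\pi} t^{2\alpha-2/q-2}\,dt, \text{ by Lemma (5.41),} \quad \le \begin{cases} Ah^{2\alpha-2/q-1} & \text{for } 2\alpha - 2/q - 1 < 0, \\ A & \text{for } 2\alpha - 2/q - 1 > 0, \\ A\log h^{-1} & \text{for } 2\alpha - 2/q - 1 = 0. \end{cases}$$

Therefore,

$$J_{2,3}(h) \le \begin{cases} Ah^{2\alpha-2/q+1}P^2(h^{-1}) & \text{for } \alpha < 1/q + 1/2, \\ Ah^2P^2(h^{-1})\log h^{-1} & \text{for } \alpha = 1/q + 1/2, \\ Ah^2P^2(h^{-1}) & \text{for } \alpha > 1/q + 1/2, \end{cases} \quad \le \begin{cases} Ah^{2\alpha} & \text{for } \alpha < 1/q + 1/2, \text{ by Lemma (5.44),} \\ Ah^2P^2(h^{-1})\log h^{-1} & \text{for } \alpha = 1/q + 1/2, \\ Ah^2P^2(h^{-1}) & \text{for } \alpha > 1/q + 1/2. \end{cases}$$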


Hence,

$$\sum_{n=1}^{\infty} \alpha_n^2\sin^2 nh \le A(h^{2\alpha} + h^2P^2(h^{-1})\log h^{-1}),$$

and the remainder of the theorem follows exactly as in Theorem (5.26).


(5.46) COROLLARY. If $f(x)$ belongs to Lip $(\alpha, q)$, $1/2 < \alpha \le 1$, $q > 2$, then $S(f)$ and $\bar S(f)$ are absolutely convergent.

Proof. Clearly $p_n$ is non-negative and non-increasing, $\lim p_n = 0$, and $|\Delta p_n|$ is non-increasing. Condition (5.39), (1) is satisfied since $P_k = 1$ for all $k$, and condition (5.39), (2) is satisfied since $\alpha + 1/2 > 1$.


(5.47) COROLLARY. If $f(x)$ belongs to Lip $(\alpha, q)$, $1/2 \ge \alpha > 1/q$, $\beta + \alpha > 1/2$, then $S(f)$ and $\bar S(f)$ are summable $|C, \beta|$.

Proof. First we observe that, for given $\alpha$ and $q$ with $\beta + \alpha > 1/2$, we can find $\beta_1 \le \beta$ such that $1/2 \ge \alpha > 1/2 - \beta_1 > 1/q$; hence it suffices to establish our result for such $\beta_1$, since, by Theorem (2.20), $|C, \beta_1| \subset |C, \beta|$ for $\beta_1 \le \beta$. There will then be no loss of generality in assuming $1/2 \ge \alpha > 1/2 - \beta > 1/q$. From the last inequality we deduce $\beta < 1/2 - 1/q < 1$. Hence, $p_n$ is non-negative, non-increasing, $\lim_{n\to\infty} p_n = 0$ and $|\Delta p_n|$ is non-increasing. Moreover, $P_n \sim n^{\beta}$; hence,

(1) $\sum_{k=1}^{\infty} P_k^{2q/(q-2)}k^{-2} \le A\sum_{k=1}^{\infty} k^{(2\beta q/(q-2))-2}.$

But

$$\frac{2\beta q}{q-2} - 2 < \frac{2q(1/2 - 1/q)}{q-2} - 2 = -1.$$

Therefore,

$$\sum_{k=1}^{\infty} P_k^{2q/(q-2)}k^{-2} \le A.$$

Also,

(2) $\sum_{k=1}^{\infty} P_k^{-1}k^{-\alpha-\frac12} \le A\sum_{k=1}^{\infty} k^{-\alpha-\beta-\frac12} \le A,$

since $\alpha + \beta + 1/2 > 1$.


Brown University.


BIBLIOGRAPHY 


1. S. Bernstein, Sur la convergence absolue des séries trigonométriques, Comptes Rendus, Paris, vol. 158(1914), pp. 1661-1663.

2. L. S. Bosanquet, The absolute Cesàro summability of a Fourier series, Proceedings of the London Mathematical Society, (2), vol. 41(1936), pp. 517-528.

3. M. Fekete, On the absolute summability (A) of infinite series, Proceedings of the Edinburgh Mathematical Society, (2), vol. 3(1932-1933), pp. 132-134.

4. G. H. Hardy and J. E. Littlewood, A convergence criterion for Fourier series, Mathematische Zeitschrift, vol. 28(1928), pp. 612-634.

5. J. M. Hyslop, On the absolute summability of Fourier series, Proceedings of the London Mathematical Society, (2), vol. 43(1937), pp. 475-483.

6. E. Kogbetliantz, Sur les séries absolument sommables par la méthode des moyennes arithmétiques, Bulletin des Sciences Mathématiques, (2), vol. 49(1925), pp. 234-256.

7. F. M. Mears, Absolute regularity and the Nörlund mean, Annals of Mathematics, (2), vol. 38(1937), pp. 594-601.

8. N. E. Nörlund, Lunds Universitets Årsskrift, Avd. 2, vol. 16(1919), no. 3.

9. G. Szegö, Bemerkungen zu einer Arbeit von Herrn Fejér über die Legendreschen Polynome, Mathematische Zeitschrift, vol. 25(1926), p. 177.

10. J. M. Whittaker, The absolute summability of Fourier series, Proceedings of the Edinburgh Mathematical Society, (2), vol. 2(1930-1931), pp. 1-5.

11. G. F. Woronoi, Extension of the notion of the limit of the sum of terms of an infinite series, Annals of Mathematics, (2), vol. 33(1932), pp. 422-428.








ASSOCIATED DOUBLE INTEGRAL VARIATION PROBLEMS

By Earl J. Mickle

Introduction

In a paper entitled Über adjungierte Variationsprobleme und adjungierte Extremalflächen, Haar [1]¹ has given a variation problem associated with a non-parametric double integral variation problem of the type

(1)
$$J[z] = \iint_R F(p, q)\,dx\,dy = \min., \qquad p = \frac{\partial z}{\partial x}, \quad q = \frac{\partial z}{\partial y},$$

in such a way that an extremal surface of the problem (1) determines an extremal surface of the associated problem; and a variation problem associated with a parametric double integral variation problem of the type

(2)
$$I[x, y, z] = \iint \Phi(A, B, C)\,du\,dv = \min., \qquad A = \begin{vmatrix} y_u & z_u \\ y_v & z_v \end{vmatrix}, \quad B = \begin{vmatrix} z_u & x_u \\ z_v & x_v \end{vmatrix}, \quad C = \begin{vmatrix} x_u & y_u \\ x_v & y_v \end{vmatrix},$$

in such a way that an extremal surface of the problem (2) determines an extremal surface of the associated problem. It is the purpose of this paper to show that the method used by Haar to determine such associated problems can be used to determine a group of such associated problems. With this end in view we give a summary of the results of Haar.

0.1. Non-parametric adjoint variation problem of Haar. Let us assume that the integrand function $F(p, q)$ of the problem (1) is of class² $C'''$ in a region $S$ of the $pq$-plane and define the functions

$$X(p, q) = -F_p(p, q), \qquad Y(p, q) = -F_q(p, q), \qquad Z(p, q) = F - pF_p - qF_q, \qquad \Delta(p, q) = F_{pp}F_{qq} - F_{pq}^2. \tag{3}$$

Let us further assume that the functions $F(p, q)$, $Z(p, q)$, $\Delta(p, q)$ are different from zero everywhere in $S$ and that the transformation³

$$T_3: \quad p_3 = -\frac{X(p, q)}{Z(p, q)}, \qquad q_3 = -\frac{Y(p, q)}{Z(p, q)}, \qquad F_3(p_3, q_3) = -\frac{1}{Z(p, q)}, \qquad \Delta_3(p, q) = \frac{\partial(p_3, q_3)}{\partial(p, q)} = \frac{F\Delta}{Z^3} \neq 0$$

carries the region $S$ in a one-to-one and continuous way into a region $S_3$ of the $p_3q_3$-plane in which the function $F_3(p_3, q_3)$ is defined. Haar called the variation problem

$$J_3[z_3] = \iint_{R_3} F_3(p_3, q_3)\,dx_3\,dy_3, \qquad p_3 = \frac{\partial z_3}{\partial x_3}, \quad q_3 = \frac{\partial z_3}{\partial y_3}, \tag{4}$$

considered in an $(x_3, y_3, z_3)$-coordinate system, the adjoint variation problem of (1).

Received November 21, 1941.

1 Numbers in square brackets refer to the bibliography given at the end of this paper.

2 A function is said to be of class $C^{(n)}$ in a region if the function together with its partial derivatives up to and including those of the n-th order are continuous in the region. A surface $z = z(x, y)$ is said to be of class $C^{(n)}$ in a region of the $xy$-plane if $z(x, y)$ is of class $C^{(n)}$ in the region.

3 The particular choice of subscripts here used is for reference in later work.

A surface $z = z(x, y)$ of class $C''$ is called an extremal surface of the problem (1) if $z(x, y)$ satisfies in $R$ the Euler-Lagrange equation⁴

$$F_{pp}(p, q)\,r + 2F_{pq}(p, q)\,s + F_{qq}(p, q)\,t = 0, \qquad r = \frac{\partial^2 z}{\partial x^2}, \quad s = \frac{\partial^2 z}{\partial x\,\partial y}, \quad t = \frac{\partial^2 z}{\partial y^2}. \tag{5}$$

This equation is a necessary and sufficient condition that there exist three single valued auxiliary functions $\xi(x, y)$, $\eta(x, y)$, $\zeta(x, y)$, uniquely determined up to additive constants, which together with the function $z(x, y)$ satisfy in $R$ the system of equations

$$\begin{aligned} d\xi &= Y(p, q)\,dz - Z(p, q)\,dy, \\ d\eta &= Z(p, q)\,dx - X(p, q)\,dz, \\ d\zeta &= X(p, q)\,dy - Y(p, q)\,dx, \end{aligned} \tag{6}$$

where $dz = p\,dx + q\,dy$. This statement follows from the fact that the cross differentiation test for exact differentials applied to the right sides of (6) reduces to the Euler-Lagrange equation (5).

Haar showed that the transformation

$$t_3[z]: \quad \begin{aligned} & x_3 = \xi(x, y), \quad y_3 = \eta(x, y), \quad z_3(x_3, y_3) = \zeta(x, y), \\ & \delta_3[z] = \frac{\partial(x_3, y_3)}{\partial(x, y)} = FZ \neq 0, \\ & \xi_3(x_3, y_3) = x, \quad \eta_3(x_3, y_3) = y, \quad \zeta_3(x_3, y_3) = z(x, y), \end{aligned}$$

where it is assumed that the interior of the region $R$ is carried in a one-to-one and continuous way into the interior of a region $R_3$ of the $x_3y_3$-plane, defines four functions $z_3(x_3, y_3)$, $\xi_3(x_3, y_3)$, $\eta_3(x_3, y_3)$, $\zeta_3(x_3, y_3)$ which satisfy the system of equations

$$\begin{aligned} d\xi_3 &= -F_{3q_3}\,dz_3 - (F_3 - p_3F_{3p_3} - q_3F_{3q_3})\,dy_3, \\ d\eta_3 &= (F_3 - p_3F_{3p_3} - q_3F_{3q_3})\,dx_3 + F_{3p_3}\,dz_3, \\ d\zeta_3 &= -F_{3p_3}\,dy_3 + F_{3q_3}\,dx_3. \end{aligned}$$

That is to say, the surface $z_3 = z_3(x_3, y_3)$ is an extremal surface of the adjoint variation problem (4).

4 See, for example, O. Bolza, Vorlesungen über Variationsrechnung, pp. 652-656.
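As an illustration (a worked sketch added here, not taken from Haar's paper), one may apply $T_3$ to the Dirichlet integrand $F = \tfrac12(p^2 + q^2)$. The computation assumes the reading $p_3 = -X/Z$, $q_3 = -Y/Z$, $F_3 = -1/Z$ of the transformation $T_3$ and the definitions (3); under that reading the adjoint integrand appears to be again of Dirichlet type.

```latex
\begin{align*}
F &= \tfrac12\,(p^2+q^2), \qquad X = -p, \qquad Y = -q, \qquad
Z = F - pF_p - qF_q = -\tfrac12\,(p^2+q^2), \qquad \Delta = 1, \\
p_3 &= -\frac{X}{Z} = -\frac{2p}{p^2+q^2}, \qquad
q_3 = -\frac{Y}{Z} = -\frac{2q}{p^2+q^2}, \qquad
p_3^2 + q_3^2 = \frac{4}{p^2+q^2}, \\
F_3(p_3,q_3) &= -\frac{1}{Z} = \frac{2}{p^2+q^2} = \tfrac12\,(p_3^2+q_3^2).
\end{align*}
```

The hypotheses $F \neq 0$, $Z \neq 0$ exclude the point $p = q = 0$, which is where the map $(p, q) \mapsto (p_3, q_3)$ is singular.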

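0.2. Parametric adjoint variation problem of Haar. Let us assume that the integrand function $\Phi(A, B, C)$ of the variation problem (2) is of class $C''$ in a region $\Sigma$ of $(A, B, C)$-space, that the quantity $(\Phi^4/C^2)(\Phi_{AA}\Phi_{BB} - \Phi_{AB}^2)$ is everywhere different from zero in $\Sigma$, and that the transformation

$$\bar A = \Phi\Phi_A, \quad \bar B = \Phi\Phi_B, \quad \bar C = \Phi\Phi_C, \quad \bar\Phi(\bar A, \bar B, \bar C) = \Phi(A, B, C), \quad \frac{\partial(\bar A, \bar B, \bar C)}{\partial(A, B, C)} = \frac{\Phi^4}{C^2}\,(\Phi_{AA}\Phi_{BB} - \Phi_{AB}^2) \neq 0$$

carries the region $\Sigma$ in a one-to-one and continuous way into a region $\bar\Sigma$ of $(\bar A, \bar B, \bar C)$-space in which $\bar\Phi(\bar A, \bar B, \bar C)$ is defined. Haar called the variation problem

$$\bar I[\bar x, \bar y, \bar z] = \iint_G \bar\Phi(\bar A, \bar B, \bar C)\,du\,dv, \qquad \bar A = \begin{vmatrix} \bar y_u & \bar z_u \\ \bar y_v & \bar z_v \end{vmatrix}, \quad \bar B = \begin{vmatrix} \bar z_u & \bar x_u \\ \bar z_v & \bar x_v \end{vmatrix}, \quad \bar C = \begin{vmatrix} \bar x_u & \bar y_u \\ \bar x_v & \bar y_v \end{vmatrix}, \tag{7}$$

the adjoint variation problem of the problem (2).

A surface

$$x = x(u, v), \qquad y = y(u, v), \qquad z = z(u, v) \tag{8}$$

of class $C''$ which satisfies the Euler-Lagrange equations⁵

$$\begin{aligned} \frac{\partial}{\partial u}(y_v\Phi_C - z_v\Phi_B) + \frac{\partial}{\partial v}(z_u\Phi_B - y_u\Phi_C) &= 0, \\ \frac{\partial}{\partial u}(z_v\Phi_A - x_v\Phi_C) + \frac{\partial}{\partial v}(x_u\Phi_C - z_u\Phi_A) &= 0, \\ \frac{\partial}{\partial u}(x_v\Phi_B - y_v\Phi_A) + \frac{\partial}{\partial v}(y_u\Phi_A - x_u\Phi_B) &= 0 \end{aligned} \tag{9}$$

is called an extremal surface of the variation problem (2). These equations are a necessary and sufficient condition that there exist three single valued auxiliary functions

$$\bar x(u, v), \qquad \bar y(u, v), \qquad \bar z(u, v) \tag{10}$$

of class $C''$, uniquely determined up to additive constants, which together with the functions $x(u, v)$, $y(u, v)$, $z(u, v)$ satisfy in $G$ (see equation (2)) the system of equations

$$\begin{aligned} d\bar x &= \Phi_B\,dz - \Phi_C\,dy, \\ d\bar y &= \Phi_C\,dx - \Phi_A\,dz, \\ d\bar z &= \Phi_A\,dy - \Phi_B\,dx. \end{aligned} \tag{11}$$

Haar called the surface

$$x = \bar x(u, v), \qquad y = \bar y(u, v), \qquad z = \bar z(u, v), \tag{12}$$

determined by the auxiliary functions (10), the adjoint extremal surface of (8) and showed that it is an extremal surface of the adjoint variation problem (7).

5 See, for example, O. Bolza, loc. cit., pp. 663-667.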


0.3. Extremal surfaces of non-parametric variation problems. An extremal surface of the problem (1) has been defined as a surface of class $C''$ which satisfies the Euler-Lagrange equation (5). That is to say, an extremal surface satisfies a condition which must necessarily be satisfied by a surface if it is a minimizing surface of the problem (1). By considering variations on the dependent variable, Haar [2] has shown that if a surface $z = z(x, y)$ of class $C'$ furnishes a minimum for the integral (1) for prescribed boundary conditions, then the third equation of (6) must be satisfied by $z(x, y)$ and an auxiliary function $\zeta(x, y)$. By considering variations on each of the independent variables, Radó [3] has shown that if a surface $z = z(x, y)$ of class $C'$ furnishes a minimum for the integral (1) for prescribed boundary conditions, then the first two equations of (6) must be satisfied by $z(x, y)$ and two auxiliary functions $\xi(x, y)$ and $\eta(x, y)$. In this paper we shall call a surface of class $C'$ which satisfies these necessary conditions for a minimizing surface an extremal surface. That is to say, a surface $z = z(x, y)$ of class $C'$ shall be called an extremal surface of the variation problem (1) if there exist three single valued functions $\xi(x, y)$, $\eta(x, y)$, $\zeta(x, y)$, uniquely determined up to additive constants, which together with the function $z(x, y)$ satisfy the system of equations (6). We shall refer to the system of equations (6) as the Haar-Radó system of equations of the variation problem (1) and to the functions $\xi$, $\eta$, $\zeta$ as the Haar-Radó auxiliary functions corresponding to a given extremal surface.


0.4. An associated variation problem related to the Hamiltonian function. If variables $\pi$, $\kappa$ are introduced by means of the transformation

$$\pi = -X(p, q), \qquad \kappa = -Y(p, q),$$

the Hamiltonian function of the problem (1) is defined by the relation

$$H(\pi, \kappa) = -Z(p, q).$$

By making a transformation similar to the above transformation defining the Hamiltonian function, the author⁶ has shown that it is possible to define an associated variation problem in the following way. Assume that $\Delta(p, q)$ is everywhere different from zero in a region $S$ of the $pq$-plane and that the transformation

$$T_1: \quad p_1 = -Y(p, q), \qquad q_1 = X(p, q), \qquad F_1(p_1, q_1) = Z(p, q), \qquad \Delta_1(p, q) = \frac{\partial(p_1, q_1)}{\partial(p, q)} = \Delta \neq 0$$

carries the region $S$ in a one-to-one and continuous way into a region $S_1$ of the $p_1q_1$-plane in which $F_1(p_1, q_1)$ is defined. If $\xi(x, y)$, $\eta(x, y)$, $\zeta(x, y)$ are the Haar-Radó auxiliary functions corresponding to a given extremal surface $z = z(x, y)$ of the variation problem (1), the transformation

$$t_1[z]: \quad \begin{aligned} & x_1 = x, \quad y_1 = y, \quad z_1(x_1, y_1) = \zeta(x, y), \quad \delta_1[z] = \frac{\partial(x_1, y_1)}{\partial(x, y)} = 1, \\ & \xi_1(x_1, y_1) = \xi(x, y), \quad \eta_1(x_1, y_1) = \eta(x, y), \quad \zeta_1(x_1, y_1) = z(x, y) \end{aligned}$$

determines an extremal surface $z_1 = z_1(x_1, y_1)$ of an associated variation problem

$$J_1[z_1] = \iint_{R_1} F_1(p_1, q_1)\,dx_1\,dy_1, \qquad p_1 = \frac{\partial z_1}{\partial x_1}, \quad q_1 = \frac{\partial z_1}{\partial y_1}, \tag{13}$$

considered in an $(x_1, y_1, z_1)$-coordinate system, for which $\xi_1(x_1, y_1)$, $\eta_1(x_1, y_1)$, $\zeta_1(x_1, y_1)$ are the corresponding Haar-Radó auxiliary functions.

6 E. J. Mickle, Hamiltonian and quasi-Hamiltonian functions associated with double integral variation problems, a doctoral dissertation written under the supervision of Professor Lincoln LaPaz at The Ohio State University in 1941.
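A worked instance may help fix the signs. The sketch below, added for illustration, applies $T_1$ to the area integrand $F = (1 + p^2 + q^2)^{1/2}$, assuming the reading $p_1 = -Y$, $q_1 = X$, $F_1 = Z$ of the transformation $T_1$.

```latex
\begin{align*}
F &= (1+p^2+q^2)^{1/2}, \qquad
p_1 = -Y = F_q = \frac{q}{F}, \qquad
q_1 = X = -F_p = -\frac{p}{F}, \\
F_1 &= Z = F - \frac{p^2+q^2}{F} = \frac{1}{F}, \qquad
p_1^2 + q_1^2 = \frac{p^2+q^2}{F^2} = 1 - \frac{1}{F^2}, \\
F_1(p_1,q_1) &= (1 - p_1^2 - q_1^2)^{1/2}, \qquad p_1^2 + q_1^2 < 1 .
\end{align*}
```

This is the integrand listed under (k) in §1.4 for $i = 1$.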


0.5. Associated variation problem of Radó. Subsequent to the completion of the dissertation referred to in §0.4, T. Radó called to the attention of the author that in some unpublished notes he had investigated in addition to the variation problem $J_1[z_1]$ a third associated variation problem defined as follows.⁷ Assume that $F(p, q)$ and $Z(p, q)$ are everywhere different from zero in a region $S$ of the $pq$-plane and that the transformation

$$T_2: \quad p_2 = -\frac{q}{F(p, q)}, \qquad q_2 = \frac{p}{F(p, q)}, \qquad F_2(p_2, q_2) = -\frac{1}{F(p, q)}, \qquad \Delta_2(p, q) = \frac{\partial(p_2, q_2)}{\partial(p, q)} = \frac{Z}{F^3} \neq 0$$

carries the region $S$ in a one-to-one and continuous way into a region $S_2$ of the $p_2q_2$-plane in which $F_2(p_2, q_2)$ is defined. If $\xi(x, y)$, $\eta(x, y)$, $\zeta(x, y)$ are the Haar-Radó auxiliary functions corresponding to a given extremal surface $z = z(x, y)$ of the problem (1), the transformation

$$t_2[z]: \quad \begin{aligned} & x_2 = \xi(x, y), \quad y_2 = \eta(x, y), \quad z_2(x_2, y_2) = z(x, y), \\ & \delta_2[z] = \frac{\partial(x_2, y_2)}{\partial(x, y)} = FZ \neq 0, \\ & \xi_2(x_2, y_2) = x, \quad \eta_2(x_2, y_2) = y, \quad \zeta_2(x_2, y_2) = \zeta(x, y), \end{aligned}$$

where it is assumed that the interior of the region $R$ is carried in a one-to-one and continuous way into the interior of a region $R_2$ of the $x_2y_2$-plane, determines an extremal surface $z_2 = z_2(x_2, y_2)$ of an associated variation problem

$$J_2[z_2] = \iint_{R_2} F_2(p_2, q_2)\,dx_2\,dy_2, \qquad p_2 = \frac{\partial z_2}{\partial x_2}, \quad q_2 = \frac{\partial z_2}{\partial y_2}, \tag{14}$$

considered in an $(x_2, y_2, z_2)$-coordinate system, for which $\xi_2(x_2, y_2)$, $\eta_2(x_2, y_2)$, $\zeta_2(x_2, y_2)$ are the corresponding Haar-Radó auxiliary functions.

7 The author wishes to express his appreciation to Professor T. Radó for the use of these notes and for his suggestions during the preparation of this paper.
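The value of the Jacobian $\Delta_2$ can be checked directly. The computation below is a sketch added for the reader; it assumes the reading $p_2 = -q/F$, $q_2 = p/F$ of the transformation $T_2$ and uses $Z = F - pF_p - qF_q$ from (3).

```latex
\begin{align*}
\frac{\partial p_2}{\partial p} &= \frac{qF_p}{F^2}, &
\frac{\partial p_2}{\partial q} &= -\frac{F - qF_q}{F^2}, &
\frac{\partial q_2}{\partial p} &= \frac{F - pF_p}{F^2}, &
\frac{\partial q_2}{\partial q} &= -\frac{pF_q}{F^2},
\end{align*}
\begin{align*}
\Delta_2 = \frac{\partial(p_2,q_2)}{\partial(p,q)}
 &= \frac{-pq\,F_pF_q + (F - qF_q)(F - pF_p)}{F^4}
  = \frac{F^2 - F\,(pF_p + qF_q)}{F^4}
  = \frac{Z}{F^3}.
\end{align*}
```

Thus $\Delta_2 \neq 0$ is equivalent to $Z \neq 0$ wherever $F \neq 0$, which matches the two hypotheses made on $F$ and $Z$.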


0.6. Some further associated variation problems. The transformations $T_i$, $i = 1, 2, 3$, defining the variables $p_i$, $q_i$ and the functions $F_i(p_i, q_i)$ are of the form

$$p_i = \alpha_i(X, Y, Z, p, q, F), \quad q_i = \beta_i(X, Y, Z, p, q, F), \quad F_i(p_i, q_i) = \gamma_i(X, Y, Z, p, q, F), \quad \Delta_i(p, q) = \frac{\partial(p_i, q_i)}{\partial(p, q)} \neq 0, \tag{15}$$

and the transformations $t_i[z]$, $i = 1, 2, 3$, are of the form

$$\begin{aligned} & x_i = a_i(x, y, z, p, q), \quad y_i = b_i(x, y, z, p, q), \quad z_i(x_i, y_i) = c_i(x, y, z, p, q), \\ & \xi_i(x_i, y_i) = d_i(x, y, z, p, q), \quad \eta_i(x_i, y_i) = e_i(x, y, z, p, q), \quad \zeta_i(x_i, y_i) = f_i(x, y, z, p, q), \\ & \delta_i[z] = \frac{\partial(x_i, y_i)}{\partial(x, y)} \neq 0, \end{aligned} \tag{16}$$

where $a_i$, $b_i$, $c_i$, $d_i$, $e_i$, $f_i$ are in some order the six functions

$$x, \quad y, \quad z, \quad \int_{x^0, y^0}^{x, y} Y\,dz - Z\,dy, \quad \int_{x^0, y^0}^{x, y} Z\,dx - X\,dz, \quad \int_{x^0, y^0}^{x, y} X\,dy - Y\,dx \tag{17}$$

with $dz = p\,dx + q\,dy$. When evaluated on an extremal surface $z = z(x, y)$ of the variation problem (1), the functions (17) reduce to the functions

$$x, \quad y, \quad z(x, y), \quad \xi(x, y), \quad \eta(x, y), \quad \zeta(x, y), \tag{18}$$

where $\xi$, $\eta$, $\zeta$ are the corresponding Haar-Radó auxiliary functions. The question arises as to whether there are any further transformations of the type (15) defining an associated variation problem and a corresponding transformation of the type (16), using the six functions (17), such that the transformation $t_i[z]$ when evaluated on an extremal surface of the problem (1) determines an extremal surface of the associated problem and the corresponding Haar-Radó auxiliary functions. In Part I we give twenty-four such transformations, including the three already mentioned, which fulfill these conditions. The transformations $T_i$ have the further property that they form a group of order twenty-four.










In Part II we give eight variation problems, including the adjoint variation problem of Haar, such that each of the eight surfaces

$$x = x(u, v) \ \text{or} \ \bar x(u, v), \qquad y = y(u, v) \ \text{or} \ \bar y(u, v), \qquad z = z(u, v) \ \text{or} \ \bar z(u, v), \tag{19}$$

where $x(u, v)$, $y(u, v)$, $z(u, v)$, $\bar x(u, v)$, $\bar y(u, v)$, $\bar z(u, v)$ are functions satisfying the system of equations (11), is an extremal surface of one of these associated problems. The eight transformations defining the associated variation problems also form a group.


PART I 


Associated Non-parametric Variation Problems 


1.1. Twenty-four associated variation problems. We shall assume that the integrand function $F(p, q)$ of the variation problem (1) is of class $C^{(n)}$, $n \geq 2$, in an open region $S$ of the $pq$-plane. Consider the twenty-four transformations

$$T_i: \quad p_i = -Y_j(p, q), \qquad q_i = X_j(p, q), \qquad F_i(p_i, q_i) = Z_j(p, q), \tag{1.1}$$

where $j = i + (-1)^i$, $i = 0, 1, 2, \dots, 23$, and where the functions $X_i(p, q)$, $Y_i(p, q)$, $Z_i(p, q)$ are defined in Table I below in terms of the functions $X(p, q)$, $Y(p, q)$, $Z(p, q)$ given in (3) and the functions $p$, $q$ and $F(p, q)$. We shall assume for each particular transformation that in the region $S$ the quantity

$$\Delta_i(p, q) = \frac{\partial(p_i, q_i)}{\partial(p, q)} = X_{jp}Y_{jq} - X_{jq}Y_{jp}, \qquad j = i + (-1)^i, \tag{1.2}$$

is everywhere different from zero and that the transformation $T_i$ carries the region $S$ in a one-to-one and continuous way into a region $S_i$ of the $p_iq_i$-plane in which $F_i(p_i, q_i)$ is defined. The values of the $\Delta_i(p, q)$ are given in Table I where $\Delta(p, q)$ has the value given in (3).























TABLE I 

i}/o/}/ 2/4] 6 s | 10 | 12 | 4] 16 | 18 | 2 | 22 
re © as SA Rhee Re Phases Seeal Baad Rel Bes. 
Hl Y 1 Y | Z 1 Se : x Z 

sos eat We Gee et ae ee Oe he z | Z y| Xx 
x Z 1 | 1| x| ¥ 1)/Zz/Y 

MY) g | * |) x) 7x] 2% | -¥|-¥| | 2] y | x 
1 ae ee i-) 2 Ps ae ill 

sod tad a id :1"Si* ALAR Eee seas 
Sa a OS | | A SAP ihe a MM... 
Z te 1 zizfsiby|{x 

a} 1} = |-=| =] -x|/-=|- yj--|/=/{;=/<= 
F3 q i | p* p* p* q@ F3 F3 













TABLE I (continued) 














| | 19 21 | 23 
ele | & |ti 2] 7/_.} 2 | Fi tis 
_| F q q P Pp Pl» q| F| F 
1 1 F 1 F 1 
Y;| -—p q _— _-— q 9g _ F ae me | _-— 
F q q Pp Pp Pp q | F F 

1 I Pp F 1 1 
FT Ae es 2 ee el le J ee t q * —;}i,& 
F q q p Pp Pp q F F 
Fa gA A pd | A pA | @A Fa | FA 
a) 4) a [Mi ye) oe || yl plow) mw | | x 





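The equations defining $F_{ip_i}$, $F_{iq_i}$, $F_i - p_iF_{ip_i} - q_iF_{iq_i}$ can be found as follows. Since in the transformation $T_i$, $Y_j(p, q)$ and $X_j(p, q)$ are of class $C^{(n-1)}$, it follows from implicit function existence theorems that $p$ and $q$ as functions of $p_i$ and $q_i$ are of class $C^{(n-1)}$. Therefore,

$$dp_i = -Y_{jp}\,dp - Y_{jq}\,dq, \qquad dq_i = X_{jp}\,dp + X_{jq}\,dq.$$

From these equations,

$$dp = \frac{X_{jq}\,dp_i + Y_{jq}\,dq_i}{\Delta_i}, \qquad dq = -\frac{X_{jp}\,dp_i + Y_{jp}\,dq_i}{\Delta_i}.$$

Thus,

$$\frac{\partial p}{\partial p_i} = \frac{X_{jq}}{\Delta_i}, \qquad \frac{\partial p}{\partial q_i} = \frac{Y_{jq}}{\Delta_i}, \qquad \frac{\partial q}{\partial p_i} = -\frac{X_{jp}}{\Delta_i}, \qquad \frac{\partial q}{\partial q_i} = -\frac{Y_{jp}}{\Delta_i},$$

and hence

$$\begin{aligned} F_{ip_i} &= Z_{jp}\frac{\partial p}{\partial p_i} + Z_{jq}\frac{\partial q}{\partial p_i} = (X_{jq}Z_{jp} - X_{jp}Z_{jq})/\Delta_i, \\ F_{iq_i} &= Z_{jp}\frac{\partial p}{\partial q_i} + Z_{jq}\frac{\partial q}{\partial q_i} = (Y_{jq}Z_{jp} - Y_{jp}Z_{jq})/\Delta_i. \end{aligned} \tag{1.3}$$

From Table I and the relation (1.2), we obtain

$$F_{ip_i} = -X_i, \qquad F_{iq_i} = -Y_i, \qquad F_i - p_iF_{ip_i} - q_iF_{iq_i} = Z_i. \tag{1.4}$$

For example, for $i = 3$ we have $j = 2$ and $X_2 = -Y/Z$, $Y_2 = X/Z$, $Z_2 = -1/Z$. Substituting these values in (1.3) and using (1.2) and Table I gives

$$F_{3p_3} = -\frac{p\Delta}{Z^3\Delta_3} = -\frac{p}{F} = -X_3, \qquad F_{3q_3} = -\frac{q\Delta}{Z^3\Delta_3} = -\frac{q}{F} = -Y_3.$$

From these equations and the transformation $T_3$,

$$F_3 - p_3F_{3p_3} - q_3F_{3q_3} = Z_2 - Y_2X_3 + X_2Y_3 = -(F + pX + qY)/FZ = -1/F = Z_3.$$

The equations (1.4) show that $F_i(p_i, q_i)$ is of class $C^{(n)}$ in $S_i$.

We shall call the variation problem

$$J_i[z_i] = \iint_{R_i} F_i(p_i, q_i)\,dx_i\,dy_i, \qquad p_i = \frac{\partial z_i}{\partial x_i}, \quad q_i = \frac{\partial z_i}{\partial y_i}, \tag{1.5}$$

considered in an $(x_i, y_i, z_i)$-coordinate system, the $i$-th associated variation problem of the original variation problem (1).

We note that the problem $J_0[z_0]$ is the problem (1) itself, the problem $J_1[z_1]$ is the problem given in §0.4, the problem $J_2[z_2]$ is the associated variation problem of Radó, and the problem $J_3[z_3]$ is the adjoint variation problem of Haar.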
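The relations (1.4) can be checked by hand in the simplest non-trivial case $i = 1$. The sketch below is an added illustration; it assumes $p_1 = -Y$, $q_1 = X$, $F_1 = Z$ as in §0.4, and the values $X_1 = q$, $Y_1 = -p$, $Z_1 = F$, which are the ones (1.1) requires if $T_0$ is to be the identity.

```latex
\begin{align*}
dF_1 = dZ &= d(F - pF_p - qF_q) = -p\,dF_p - q\,dF_q = p\,dX + q\,dY
        = p\,dq_1 - q\,dp_1, \\
F_{1p_1} &= -q = -X_1, \qquad F_{1q_1} = p = -Y_1, \\
F_1 - p_1F_{1p_1} - q_1F_{1q_1} &= Z - qY - pX = Z + pF_p + qF_q = F = Z_1 .
\end{align*}
```

All three relations of (1.4) therefore hold for $i = 1$ under these assumptions.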


1.2. Extremal surfaces of the associated problems. In Table II below we give twenty-four transformations of the type (16) evaluated on an extremal surface $z = z(x, y)$ of the problem (1) with $\xi(x, y)$, $\eta(x, y)$, $\zeta(x, y)$ as the corresponding Haar-Radó auxiliary functions. The assumption that $\Delta_i(p, q) \neq 0$ in $S$, that the transformation $T_i$ determines a one-to-one and continuous mating between $S$ and $S_i$, and that the function $F_i(p_i, q_i)$ is defined for every point of $S_i$ implies that $\delta_i[z] = \dfrac{\partial(x_i, y_i)}{\partial(x, y)} \neq 0$ in $R$ for $(z_x, z_y)$ in $S$. In Table II it is to be noted that $\delta_i[z] = \delta_j[z]$ for $j = i + (-1)^i$. We shall assume in each case that the transformation $t_i[z]$ carries the interior of the region $R$ of the $xy$-plane in a one-to-one and continuous way into the interior of a region $R_i$ of the $x_iy_i$-plane in which the functions $z_i(x_i, y_i)$, $\xi_i(x_i, y_i)$, $\eta_i(x_i, y_i)$, $\zeta_i(x_i, y_i)$ are defined. From implicit function existence theorems, $x$ and $y$ as functions of $x_i$ and $y_i$ are of class $C'$ and hence so are $z_i$, $\xi_i$, $\eta_i$ and $\zeta_i$.

TABLE II 











i 061/23) 4) 5 6! 7/8 | 9| 101112113114115| 16 17:18119|20| 21! 22.23 
me faaelelelelzlzlzleieyloelelyvelelaaeeclale 
we lolvlaelelealelzlaeleloolelsielelelelel eles 
adesw= le{tleitiylaciz alyazlelseleicaly ecu a ze 
~et.wd= lelelzezlrisielelelelalaaaaarcisiyvelelyly 
ae w= [aaiyy el elyly le lee] sta mele|z(2(g¢giaele le 
7 f(x,y) = the hele ye le y r El zz) zely alae nye Zz 
~ ls) = “4 | FZ | -q| —eX| —-X| -—p|py| Y | —pz| qZ| FY | FX 


The Haar-Radó system of equations (6) can be written

$$\begin{aligned} dz &= p\,dx + q\,dy, \\ d\xi &= Y\,dz - Z\,dy = -p\,d\zeta - F\,dy, \\ d\eta &= Z\,dx - X\,dz = F\,dx - q\,d\zeta, \\ d\zeta &= X\,dy - Y\,dx. \end{aligned} \tag{1.6}$$

If the values of $dx$, $dy$, $dz$, $d\xi$, $d\eta$, $d\zeta$ as determined by the transformation $t_i[z]$ are substituted in (1.6), this system of equations can be rewritten in the form

$$\begin{aligned} dz_i &= X_j\,dy_i - Y_j\,dx_i, \qquad j = i + (-1)^i, \\ d\xi_i &= Y_i\,dz_i - Z_i\,dy_i, \\ d\eta_i &= Z_i\,dx_i - X_i\,dz_i, \\ d\zeta_i &= X_i\,dy_i - Y_i\,dx_i, \end{aligned} \tag{7}$$

where, as defined in Table I, $X_i$, $Y_i$, $Z_i$, $X_j$ and $Y_j$ are functions of $p$ and $q$ and are evaluated on the extremal surface $z = z(x, y)$ defining the transformation $t_i[z]$ with $x$ and $y$ as functions of $x_i$ and $y_i$.

For example, if $i = 3$, then from Table II, $dx = d\xi_3$, $dy = d\eta_3$, $dz = d\zeta_3$, $d\xi = dx_3$, $d\eta = dy_3$, $d\zeta = dz_3$. Substituting these relations in (1.6) gives

$$\begin{aligned} d\zeta_3 &= p\,d\xi_3 + q\,d\eta_3, \\ dx_3 &= -p\,dz_3 - F\,d\eta_3, \\ dy_3 &= F\,d\xi_3 - q\,dz_3, \\ dz_3 &= X\,d\eta_3 - Y\,d\xi_3. \end{aligned} \tag{1.7}$$

From the third and second equations of (1.7) we obtain respectively on using Table I

$$\begin{aligned} d\xi_3 &= \frac{1}{F}\,dy_3 + \frac{q}{F}\,dz_3 = Y_3\,dz_3 - Z_3\,dy_3, \\ d\eta_3 &= -\frac{1}{F}\,dx_3 - \frac{p}{F}\,dz_3 = Z_3\,dx_3 - X_3\,dz_3. \end{aligned}$$

Substituting these values of $d\xi_3$ and $d\eta_3$ in the first and the fourth equations of (1.7) gives respectively on using Table I

$$\begin{aligned} d\zeta_3 &= p\left(\frac{1}{F}\,dy_3 + \frac{q}{F}\,dz_3\right) - q\left(\frac{1}{F}\,dx_3 + \frac{p}{F}\,dz_3\right) = \frac{p}{F}\,dy_3 - \frac{q}{F}\,dx_3 = X_3\,dy_3 - Y_3\,dx_3, \\ dz_3 &= -X\left(\frac{1}{F}\,dx_3 + \frac{p}{F}\,dz_3\right) - Y\left(\frac{1}{F}\,dy_3 + \frac{q}{F}\,dz_3\right). \end{aligned}$$

From this last equation we obtain

$$(F + pX + qY)\,dz_3 = -Y\,dy_3 - X\,dx_3,$$

which gives, since $Z = F + pX + qY$,

$$dz_3 = -\frac{Y}{Z}\,dy_3 - \frac{X}{Z}\,dx_3 = X_2\,dy_3 - Y_2\,dx_3.$$


If when $p$ and $q$ are functions of $x_i$ and $y_i$ the system of equations (7) is satisfied by four functions $z_i(x_i, y_i)$, $\xi_i(x_i, y_i)$, $\eta_i(x_i, y_i)$, $\zeta_i(x_i, y_i)$, the inverse of the transformation $t_i[z]$ will determine four functions $z(x, y)$, $\xi(x, y)$, $\eta(x, y)$, $\zeta(x, y)$ which satisfy the system of equations (1.6). That is to say, the surface $z = z(x, y)$ is an extremal surface of the variation problem (1) on which $\delta_i[z] \neq 0$ and for which $\xi$, $\eta$, $\zeta$ are the corresponding Haar-Radó auxiliary functions.

If the relations given by the transformation $T_i$ and (1.4) are substituted in the system of equations (7) this system may be rewritten in the form

$$\begin{aligned} dz_i &= p_i\,dx_i + q_i\,dy_i, \\ d\xi_i &= -F_{iq_i}\,dz_i - (F_i - p_iF_{ip_i} - q_iF_{iq_i})\,dy_i, \\ d\eta_i &= (F_i - p_iF_{ip_i} - q_iF_{iq_i})\,dx_i + F_{ip_i}\,dz_i, \\ d\zeta_i &= -F_{ip_i}\,dy_i + F_{iq_i}\,dx_i. \end{aligned} \tag{7}$$

This system of equations is the Haar-Radó system of equations for the $i$-th associated problem. These results may be summarized as follows.

The transformation $t_i[z]$ taken on an extremal surface $z = z(x, y)$ of the variation problem (1) on which $\delta_i[z] \neq 0$ determines an extremal surface $z_i = z_i(x_i, y_i)$ of the $i$-th associated problem $J_i[z_i]$ for which $\xi_i(x_i, y_i)$, $\eta_i(x_i, y_i)$, $\zeta_i(x_i, y_i)$ are the corresponding Haar-Radó auxiliary functions and, conversely, the inverse of the transformation $t_i[z]$ taken on an extremal surface $z_i = z_i(x_i, y_i)$ of the $i$-th associated variation problem determines an extremal surface $z = z(x, y)$ of the original problem (1) on which $\delta_i[z] \neq 0$ and for which $\xi(x, y)$, $\eta(x, y)$, $\zeta(x, y)$ are the corresponding Haar-Radó auxiliary functions.
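The case $i = 1$ gives a quick check of the summary. The sketch below is an added illustration; it assumes the transformation $t_1[z]$ of §0.4 ($x_1 = x$, $y_1 = y$, $z_1 = \zeta$, $\xi_1 = \xi$, $\eta_1 = \eta$, $\zeta_1 = z$) together with $X_0 = X$, $Y_0 = Y$ and $X_1 = q$, $Y_1 = -p$, $Z_1 = F$.

```latex
\begin{align*}
dz_1 &= X_0\,dy_1 - Y_0\,dx_1 = X\,dy - Y\,dx = d\zeta, \\
d\xi_1 &= Y_1\,dz_1 - Z_1\,dy_1 = -p\,d\zeta - F\,dy = d\xi, \\
d\eta_1 &= Z_1\,dx_1 - X_1\,dz_1 = F\,dx - q\,d\zeta = d\eta, \\
d\zeta_1 &= X_1\,dy_1 - Y_1\,dx_1 = q\,dy + p\,dx = dz .
\end{align*}
```

Each line reproduces one equation of (1.6), so under these assumptions the four functions $z_1$, $\xi_1$, $\eta_1$, $\zeta_1$ satisfy the system of the first associated problem exactly when $z$, $\xi$, $\eta$, $\zeta$ satisfy the original one.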


1.3. Group property of the transformations $T_i$. In §1.1 each transformation $T_i$ is considered independently of the other twenty-three transformations. In this section we shall assume that in the region $S$ the quantities $\Delta_i(p, q)$ are everywhere different from zero simultaneously. From Table I this is seen to be equivalent to the assumption that in $S$ the quantities $X$, $Y$, $Z$, $F$, $p$, $q$, $\Delta$ are everywhere different from zero. From Table I, the transformations $T_i$ and the relations (1.4), this assumption is seen to imply that in the regions $S_i$

$$\begin{aligned} & X_i \neq 0, \quad Y_i \neq 0, \quad Z_i \neq 0, \quad p_i \neq 0, \quad q_i \neq 0, \quad F_i \neq 0, \\ & F_{ip_ip_i}F_{iq_iq_i} - F_{ip_iq_i}^2 = \frac{\Delta_j}{\Delta_i} \neq 0, \qquad j = i + (-1)^i, \quad i = 0, 1, 2, \dots, 23. \end{aligned} \tag{1.8}$$

The functions $X_i(p, q)$, $Y_i(p, q)$, $Z_i(p, q)$ defined in Table I are of the form

$$X_i = \alpha_i(X, Y, Z, p, q, F), \qquad Y_i = \beta_i(X, Y, Z, p, q, F), \qquad Z_i = \gamma_i(X, Y, Z, p, q, F).$$

By the transformation $T_kT_i$, we shall mean the transformation

$$\begin{aligned} p_{ki} &= -\beta_j(X_k, Y_k, Z_k, p_k, q_k, F_k), \\ q_{ki} &= \alpha_j(X_k, Y_k, Z_k, p_k, q_k, F_k), \qquad j = i + (-1)^i, \\ F_{ki} &= \gamma_j(X_k, Y_k, Z_k, p_k, q_k, F_k). \end{aligned} \tag{1.9}$$






By (1.8) this transformation is always possible. We leave for the reader the verification of the following relations:

(1.10) $T_0T_i = T_iT_0 = T_i$, $i = 0, 1, 2, \dots, 23$,
(1.11) $T_i^2 = T_0$, $i = 1, 2, 3, 13$,
(1.12) $T_1T_2 = T_3$, $T_1T_3 = T_2$, $T_2T_3 = T_1$,
(1.13) $T_{13}T_iT_{13} = T_i$, $i = 0, 1, 2, 3$,
(1.14) $T_{13}T_i = T_k$, $k = 13, 12, 7, 6$ for $i = 0, 1, 2, 3$ respectively,
(1.15) $T_4^3 = T_0$, $T_4^2 = T_{10}$,
(1.16) $T_{10}T_iT_4 = T_k$, $k = 13, 6, 2, 7$ for $i = 1, 2, 3, 13$ respectively,
(1.17) $T_4T_i = T_k$, $k = 5, 20, 21, 16, 17, 8, 9$ for $i = 1, 2, 3, 6, 7, 12, 13$ respectively,
(1.18) $T_{10}T_i = T_k$, $k = 11, 22, 23, 14, 15, 18, 19$ for $i = 1, 2, 3, 6, 7, 12, 13$ respectively.

From (1.10), (1.11), (1.12) it follows that the transformations $T_i$, $i = 0, 1, 2, 3$, form a group of order four which we shall call $G_1$. From (1.11) $T_{13}^{-1} = T_{13}$ and from (1.13) the group $G_1$ is invariant under the transformation $T_{13}$. Therefore, $G_1$ and $T_{13}$ generate a group $G_2$ of order eight whose elements by (1.14) are $T_i$, $i = 0, 1, 2, 3, 6, 7, 12, 13$. From (1.15) the inverse of the transformation $T_4$ is $T_{10}$. By (1.16) the group $G_2$ is invariant under the transformation $T_4$. Therefore, $G_2$ and $T_4$ generate a group of order twenty-four whose elements by (1.17) and (1.18) are the transformations $T_i$, $i = 0, 1, 2, \dots, 23$.
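One of the relations left to the reader can be verified in a few lines. The sketch below, added for illustration, checks $T_1T_1 = T_0$ from the definition (1.9); it assumes $\alpha_0 = X$, $\beta_0 = Y$, $\gamma_0 = Z$ (so that $X_0 = X$, $Y_0 = Y$, $Z_0 = Z$) and the values $X_1 = q$, $Y_1 = -p$, $Z_1 = F$ that make $T_0$ the identity in (1.1).

```latex
\begin{align*}
i = 1,\ j = 0: \qquad
p_{11} &= -\beta_0(X_1, Y_1, Z_1, p_1, q_1, F_1) = -Y_1 = p, \\
q_{11} &= \alpha_0(X_1, Y_1, Z_1, p_1, q_1, F_1) = X_1 = q, \\
F_{11} &= \gamma_0(X_1, Y_1, Z_1, p_1, q_1, F_1) = Z_1 = F .
\end{align*}
```

So $T_1T_1$ returns the original variables and integrand, which is the case $i = 1$ of (1.11).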


1.4. Variation problems associated with the Dirichlet and the area integrals. The variation problems associated with the Dirichlet integral

$$J[z] = \frac{1}{2}\iint_R (p^2 + q^2)\,dx\,dy \tag{1.19}$$

by means of the transformations $T_i$ are given below. The subscripts have been dropped from the $x_i$, $y_i$, $z_i$, $p_i$, $q_i$. The numbers after the problems refer to the transformations $T_i$ giving the associated problem.

(a) $J[z] = \frac{1}{2}\iint_R (p^2 + q^2)\,dx\,dy$, $i = 0, 1, 2, 3$;
(b) $J[z] = \iint_R \dfrac{1 + q^2}{p}\,dx\,dy$, $p \neq 0$, $i = 4, 9, 16, 21$;
(c) $J[z] = \iint_R \dfrac{1 + p^2}{q}\,dx\,dy$, $q \neq 0$, $i = 10, 15, 18, 23$;
(d) $J[z] = \iint_R (2q - p^2)^{1/2}\,dx\,dy$, $2q - p^2 > 0$, $i = 8, 17$;
(e) $J[z] = \iint_R (2p - q^2)^{1/2}\,dx\,dy$, $2p - q^2 > 0$, $i = 11, 22$;
(f) $J[z] = \iint_R (2pq - 1)^{1/2}\,dx\,dy$, $2pq > 1$, $i = 7, 12, 13$;
(g) $J[z] = \iint_R (-2q - p^2)^{1/2}\,dx\,dy$, $2q + p^2 < 0$, $i = 5, 20$;
(h) $J[z] = \iint_R (-2p - q^2)^{1/2}\,dx\,dy$, $2p + q^2 < 0$, $i = 14, 19$;
(i) $J[z] = \iint_R (-2pq - 1)^{1/2}\,dx\,dy$, $2pq < -1$, $i = 6$.

Similarly the variation problems associated with the area integral

$$J[z] = \iint_R (1 + p^2 + q^2)^{1/2}\,dx\,dy \tag{1.20}$$

are

(j) $J[z] = \iint_R (1 + p^2 + q^2)^{1/2}\,dx\,dy$, $i = 0, 3, 4, 10, 21, 23$;
(k) $J[z] = \iint_R (1 - p^2 - q^2)^{1/2}\,dx\,dy$, $p^2 + q^2 < 1$, $i = 1, 2, 5, 11, 20, 22$;
(l) $J[z] = \iint_R (p^2 - q^2 - 1)^{1/2}\,dx\,dy$, $p^2 - q^2 > 1$, $i = 6, 9, 13, 14, 16, 19$;
(m) $J[z] = \iint_R (q^2 - p^2 - 1)^{1/2}\,dx\,dy$, $q^2 - p^2 > 1$, $i = 7, 8, 12, 15, 17, 18$.

If the integrand function $F(p, q)$ of the problem (1) is an analytic function of $p$ and $q$ in the region $S$ and if an extremal surface $z = z(x, y)$ defining the transformation $t_i[z]$ is an analytic function of $x$ and $y$ in the interior of the region $R$, then it follows from implicit function existence theorems that the extremal surface $z_i = z_i(x_i, y_i)$ of the $i$-th associated problem is an analytic function of $x_i$ and $y_i$ in the interior of the region $R_i$. Haar [2] has shown that an extremal surface of the Dirichlet integral (1.19) is necessarily analytic and Radó [3] has shown that an extremal surface of the area integral (1.20) is necessarily analytic. We thus have the following results.


If a surface $z = z(x, y)$ of class $C'$ is an extremal surface of one of the variation problems associated with the Dirichlet integral or the area integral, then $z(x, y)$ is an analytic function of $x$ and $y$ in the interior of the region $R$ and in $R$ satisfies the corresponding Euler-Lagrange second order partial differential equation

(a) $r + t = 0$;
(b) $(1 + q^2)r - 2pqs + p^2t = 0$, $p \neq 0$;
(c) $q^2r - 2pqs + (1 + p^2)t = 0$, $q \neq 0$;
(d) $2qr - 2ps + t = 0$, $2q - p^2 > 0$;
(e) $r - 2qs + 2pt = 0$, $2p - q^2 > 0$;
(f) $q^2r - 2(pq - 1)s + p^2t = 0$, $2pq > 1$;
(g) $2qr - 2ps - t = 0$, $2q + p^2 < 0$;
(h) $r + 2qs - 2pt = 0$, $2p + q^2 < 0$;
(i) $q^2r - 2(pq + 1)s + p^2t = 0$, $2pq < -1$;
(j) $(1 + q^2)r - 2pqs + (1 + p^2)t = 0$;
(k) $(q^2 - 1)r - 2pqs + (p^2 - 1)t = 0$, $p^2 + q^2 < 1$;
(l) $(q^2 + 1)r - 2pqs + (p^2 - 1)t = 0$, $p^2 - q^2 > 1$;
(m) $(q^2 - 1)r - 2pqs + (p^2 + 1)t = 0$, $q^2 - p^2 > 1$.
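Any line of this list can be recovered from the general equation (5). As an added worked sketch, take the integrand of case (d), read here as $F = (2q - p^2)^{1/2}$ with $2q - p^2 > 0$:

```latex
\begin{align*}
F^2 &= 2q - p^2, \qquad F_p = -\frac{p}{F}, \qquad F_q = \frac{1}{F}, \\
F_{pp} &= -\frac{1}{F} - \frac{p^2}{F^3} = -\frac{2q}{F^3}, \qquad
F_{pq} = \frac{p}{F^3}, \qquad
F_{qq} = -\frac{1}{F^3}, \\
F_{pp}\,r + 2F_{pq}\,s + F_{qq}\,t
 &= -\frac{1}{F^3}\,\bigl(2q\,r - 2p\,s + t\bigr) = 0
 \quad\Longleftrightarrow\quad 2q\,r - 2p\,s + t = 0 .
\end{align*}
```

The remaining equations follow in the same way from their integrands; (e) and (h) are obtained from (d) and (g) by interchanging $p$ with $q$ and $r$ with $t$.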


PART II 


Associated Parametric Variation Problems 


2.1. Some fundamental relations. For the variation problem

$$I[x, y, z] = \iint_G \Phi(A, B, C)\,du\,dv = \min., \tag{2.1}$$

where

$$A = \begin{vmatrix} y_u & z_u \\ y_v & z_v \end{vmatrix}, \qquad B = \begin{vmatrix} z_u & x_u \\ z_v & x_v \end{vmatrix}, \qquad C = \begin{vmatrix} x_u & y_u \\ x_v & y_v \end{vmatrix}, \tag{2.2}$$

we make the assumptions

(a) $\Phi$ is of class $C^{(n)}$, $n \geq 2$, in a star shaped region $\Sigma$ of $(A, B, C)$-space.⁸

(b) $\Phi$ is homogeneous of degree one with respect to its three arguments, i.e., for all $k > 0$,

$$\Phi(kA, kB, kC) = k\Phi(A, B, C).$$

The condition (b) implies that

$$\Phi = A\Phi_A + B\Phi_B + C\Phi_C. \tag{2.3}$$

Let

$$x = x(u, v), \qquad y = y(u, v), \qquad z = z(u, v) \tag{2.4}$$

be an arbitrary extremal surface of the variation problem (2.1) and let

$$\bar x(u, v), \qquad \bar y(u, v), \qquad \bar z(u, v) \tag{2.5}$$

be three auxiliary functions of class $C''$ which together with the functions $x(u, v)$, $y(u, v)$, $z(u, v)$ satisfy the system of equations

$$\begin{aligned} d\bar x &= \Phi_B\,dz - \Phi_C\,dy, \\ d\bar y &= \Phi_C\,dx - \Phi_A\,dz, \\ d\bar z &= \Phi_A\,dy - \Phi_B\,dx. \end{aligned} \tag{2.6}$$

Since $A$, $B$, $C$ are direction components of the normals to the surface (2.4), we have

$$A\,dx + B\,dy + C\,dz = 0. \tag{2.7}$$

If in (2.6) the first equation is multiplied by $\Phi_A$, the second by $\Phi_B$, the third by $\Phi_C$, we obtain on adding

$$\Phi_A\,d\bar x + \Phi_B\,d\bar y + \Phi_C\,d\bar z = 0. \tag{2.8}$$

If in (2.6) the first equation is multiplied by $B$ and the second by $-A$, we obtain on adding and using relations (2.3), (2.7)

$$B\,d\bar x - A\,d\bar y - \Phi\,dz = 0. \tag{2.9}$$

Similarly,

$$C\,d\bar x + \Phi\,dy - A\,d\bar z = 0, \tag{2.10}$$

$$\Phi\,dx - C\,d\bar y + B\,d\bar z = 0. \tag{2.11}$$

From (2.6) we have

$$\begin{aligned} \bar x_u &= \Phi_Bz_u - \Phi_Cy_u, & \bar y_u &= \Phi_Cx_u - \Phi_Az_u, & \bar z_u &= \Phi_Ay_u - \Phi_Bx_u, \\ \bar x_v &= \Phi_Bz_v - \Phi_Cy_v, & \bar y_v &= \Phi_Cx_v - \Phi_Az_v, & \bar z_v &= \Phi_Ay_v - \Phi_Bx_v. \end{aligned} \tag{2.12}$$

8 A region $\Sigma$ of $(A, B, C)$-space is said to be star shaped if for every number $k > 0$ the point $(kA, kB, kC)$ is in $\Sigma$ if the point $(A, B, C)$ is in $\Sigma$.
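The step "on adding and using relations (2.3), (2.7)" that leads to (2.9) can be written out as follows. This is an added worked sketch that uses only (2.3), (2.6) and (2.7), with $\bar x$, $\bar y$ denoting the first two auxiliary functions of (2.5).

```latex
\begin{align*}
B\,d\bar x - A\,d\bar y
 &= B\,(\Phi_B\,dz - \Phi_C\,dy) - A\,(\Phi_C\,dx - \Phi_A\,dz) \\
 &= (A\Phi_A + B\Phi_B)\,dz - \Phi_C\,(A\,dx + B\,dy) \\
 &= (\Phi - C\Phi_C)\,dz + C\Phi_C\,dz = \Phi\,dz .
\end{align*}
```

The relations (2.10) and (2.11) follow from the same two substitutions applied to the other pairs of equations in (2.6).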










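From (2.12) we can compute the further relations

$$\begin{aligned} & \begin{vmatrix} z_u & \bar x_u \\ z_v & \bar x_v \end{vmatrix} = A\Phi_C, & & \begin{vmatrix} \bar x_u & y_u \\ \bar x_v & y_v \end{vmatrix} = -A\Phi_B, & & \begin{vmatrix} \bar y_u & z_u \\ \bar y_v & z_v \end{vmatrix} = -B\Phi_C, \\ & \begin{vmatrix} x_u & \bar y_u \\ x_v & \bar y_v \end{vmatrix} = B\Phi_A, & & \begin{vmatrix} y_u & \bar z_u \\ y_v & \bar z_v \end{vmatrix} = C\Phi_B, & & \begin{vmatrix} \bar z_u & x_u \\ \bar z_v & x_v \end{vmatrix} = -C\Phi_A, \\ & \begin{vmatrix} \bar y_u & \bar z_u \\ \bar y_v & \bar z_v \end{vmatrix} = \Phi\Phi_A, & & \begin{vmatrix} \bar z_u & \bar x_u \\ \bar z_v & \bar x_v \end{vmatrix} = \Phi\Phi_B, & & \begin{vmatrix} \bar x_u & \bar y_u \\ \bar x_v & \bar y_v \end{vmatrix} = \Phi\Phi_C. \end{aligned} \tag{2.13}$$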
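Two representative entries of (2.13) are worked out below as an added sketch, using only (2.2), (2.3) and the expressions (2.12) for the derivatives of the auxiliary functions $\bar x$, $\bar y$, $\bar z$.

```latex
\begin{align*}
\bar x_u y_v - \bar x_v y_u
 &= (\Phi_B z_u - \Phi_C y_u)\,y_v - (\Phi_B z_v - \Phi_C y_v)\,y_u
  = \Phi_B\,(z_u y_v - z_v y_u) = -A\,\Phi_B, \\
\bar y_u \bar z_v - \bar y_v \bar z_u
 &= (\Phi_C x_u - \Phi_A z_u)(\Phi_A y_v - \Phi_B x_v)
  - (\Phi_C x_v - \Phi_A z_v)(\Phi_A y_u - \Phi_B x_u) \\
 &= \Phi_A\Phi_C\,C + \Phi_A^2\,A + \Phi_A\Phi_B\,B
  = \Phi_A\,(A\Phi_A + B\Phi_B + C\Phi_C) = \Phi\,\Phi_A .
\end{align*}
```

The other seven determinants are obtained in the same way by cyclic permutation of $(x, y, z)$ and $(A, B, C)$.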


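2.2. Transformations using functions homogeneous of degree one. If $\alpha(A, B, C)$, $\beta(A, B, C)$, $\gamma(A, B, C)$, $\delta(A, B, C)$ are four functions homogeneous of degree one and if the transformation

$$T: \quad \bar A = \alpha(A, B, C), \quad \bar B = \beta(A, B, C), \quad \bar C = \gamma(A, B, C), \quad \bar\Phi(\bar A, \bar B, \bar C) = \delta(A, B, C), \quad \frac{\partial(\bar A, \bar B, \bar C)}{\partial(A, B, C)} \neq 0, \tag{2.14}$$

carries a star shaped region $\Sigma$ of $(A, B, C)$-space in a one-to-one and continuous way into a region $\bar\Sigma$ of $(\bar A, \bar B, \bar C)$-space in which $\bar\Phi$ is defined, then $\bar\Sigma$ is star shaped and $\bar\Phi$ is homogeneous of degree one.

Proof. If a point $(A^0, B^0, C^0)$ in $\Sigma$ is mated to a point $(\bar A^0, \bar B^0, \bar C^0)$ in $\bar\Sigma$, then the point $(k\bar A^0, k\bar B^0, k\bar C^0)$ for $k > 0$ is in $\bar\Sigma$ and is mated to the point $(kA^0, kB^0, kC^0)$. This follows from the fact that since $\alpha$, $\beta$, and $\gamma$ are homogeneous of degree one

$$\alpha(kA^0, kB^0, kC^0) = k\alpha(A^0, B^0, C^0) = k\bar A^0, \quad \text{etc.}$$

From the fact that $\delta$ is homogeneous of degree one

$$\bar\Phi(k\bar A^0, k\bar B^0, k\bar C^0) = \delta(kA^0, kB^0, kC^0) = k\delta(A^0, B^0, C^0) = k\bar\Phi(\bar A^0, \bar B^0, \bar C^0).$$

Therefore, $\bar\Sigma$ is star shaped and $\bar\Phi$ is homogeneous of degree one.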


2.3. Eight associated variation problems. Consider the eight transformations $T_i$ of the type (2.14) given in Table III below. We shall assume that in the star shaped region $\Sigma$ of $(A, B, C)$-space in which $\Phi$ is defined, the quantity $\partial(A_i, B_i, C_i)/\partial(A, B, C)$ is everywhere different from zero and that the transformation $T_i$ carries the region $\Sigma$ in a one-to-one and continuous way into a region $\Sigma_i$ of $(A_i, B_i, C_i)$-space in which $\Phi_i(A_i, B_i, C_i)$ is defined. Since the function $\Phi(A, B, C)$ is homogeneous of degree one, the functions defining the transformation $T_i$ are homogeneous of degree one. Therefore, by the results of §2.2, the region $\Sigma_i$ is star shaped and the function $\Phi_i$ is homogeneous of degree one. The equations defining $\Phi_{iA_i}$, $\Phi_{iB_i}$, and $\Phi_{iC_i}$ in Table III are found as follows.




TABLE III 


9(A;, B;, Ci) 


#;(A;, B; , C;) (A,B,C) 


Pp 1 


A 2 (Sep Pec 
BAO44 Pec 


C(@44 P58 


Cd 


bb (S44 cy )| 4 
A @ AA®BB _ 6 


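If $A$, $B$, $C$ are considered as functions of $A_1$, $B_1$, $C_1$ according to the inverse of the transformation $T_1$, then, since they are of class $C^{(n-1)}$, we have

$$dA_1 = dA; \qquad dB_1 = \Phi_C\,dA + A\,d\Phi_C; \qquad dC_1 = -\Phi_B\,dA - A\,d\Phi_B.$$

From these relations we obtain

$$C\,d\Phi_C = \frac{C}{A}\,dB_1 - \frac{C\Phi_C}{A}\,dA_1; \qquad B\,d\Phi_B = -\frac{B}{A}\,dC_1 - \frac{B\Phi_B}{A}\,dA_1 \tag{2.15}$$

and from (2.3)

$$A\,d\Phi_A + B\,d\Phi_B + C\,d\Phi_C = 0. \tag{2.16}$$

Using (2.15) and (2.16) gives

$$\begin{aligned} d\Phi_1 &= \Phi_A\,dA_1 + A\,d\Phi_A = \Phi_A\,dA_1 - (B\,d\Phi_B + C\,d\Phi_C) \\ &= \Phi_A\,dA_1 + \frac{B}{A}\,dC_1 + \frac{B\Phi_B}{A}\,dA_1 - \frac{C}{A}\,dB_1 + \frac{C\Phi_C}{A}\,dA_1 \\ &= \frac{\Phi}{A}\,dA_1 - \frac{C}{A}\,dB_1 + \frac{B}{A}\,dC_1. \end{aligned}$$

Therefore,

$$\Phi_{1A_1} = \frac{\Phi}{A}, \qquad \Phi_{1B_1} = -\frac{C}{A}, \qquad \Phi_{1C_1} = \frac{B}{A}.$$

In a similar manner we obtain the remaining relations given in Table III. These relations show that $\Phi_i$ is of class $C^{(n)}$ in $\Sigma_i$.

We shall call the variation problem

$$I_i[x_i, y_i, z_i] = \iint_G \Phi_i(A_i, B_i, C_i)\,du\,dv, \qquad A_i = \begin{vmatrix} y_{iu} & z_{iu} \\ y_{iv} & z_{iv} \end{vmatrix}, \quad B_i = \begin{vmatrix} z_{iu} & x_{iu} \\ z_{iv} & x_{iv} \end{vmatrix}, \quad C_i = \begin{vmatrix} x_{iu} & y_{iu} \\ x_{iv} & y_{iv} \end{vmatrix}, \tag{2.17}$$

the $i$-th associated variation problem of the original problem (2.1).

We note that the 0-th associated problem is the problem (2.1) itself and the seventh associated problem is the adjoint variation problem of Haar.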
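Because each $\Phi_i$ is homogeneous of degree one, it must satisfy the analogue of Euler's relation (2.3) in its own variables, and this gives a consistency check on the derivatives just computed. The sketch below is an added illustration; it assumes the reading $A_1 = A$, $B_1 = A\Phi_C$, $C_1 = -A\Phi_B$, $\Phi_1 = A\Phi_A$ of the transformation treated in the derivation above (the index 1 is itself an assumption about the source).

```latex
\begin{align*}
A_1\Phi_{1A_1} + B_1\Phi_{1B_1} + C_1\Phi_{1C_1}
 &= A\cdot\frac{\Phi}{A} + A\Phi_C\cdot\Bigl(-\frac{C}{A}\Bigr)
    + (-A\Phi_B)\cdot\frac{B}{A} \\
 &= \Phi - C\Phi_C - B\Phi_B = A\Phi_A = \Phi_1 .
\end{align*}
```

The last step uses (2.3) for the original integrand $\Phi$.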


2.4. Extremal surfaces of the associated problems. In Table IV below for convenience of reference we rename the six functions $x(u, v)$, $y(u, v)$, $z(u, v)$, $\bar x(u, v)$, $\bar y(u, v)$, $\bar z(u, v)$ given in (2.4) and (2.5).








TABLE IV 
Jol}1}2/3/4/5/6)7 or 0/1/2/3|/4/8|/6/7 
ai (u, v) ‘aiaizlzl2lelel2 #;(u, v) ‘tlhelelelelelate 
= yi(u, v) yiyi9 y 9yagy7 i (u, v) 99 vidlylolyiy 
2i(u, v) ‘eleleisial bias 2; (u, v) rel atalelsisisis 


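By the relations (2.13) and the transformation $T_i$ we have, on using the notation
of Table IV,

(2.18)
$$A_i = \begin{vmatrix} y_{iu} & z_{iu} \\ y_{iv} & z_{iv} \end{vmatrix}, \qquad B_i = \begin{vmatrix} z_{iu} & x_{iu} \\ z_{iv} & x_{iv} \end{vmatrix}, \qquad C_i = \begin{vmatrix} x_{iu} & y_{iu} \\ x_{iv} & y_{iv} \end{vmatrix}.$$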


Using the relations of Table III in the equations (2.6) to (2.11) and using the 
notation of Table IV, we obtain 


(2.19)
$$d\bar{x}_i = \Phi_{iB_i}\,dz_i - \Phi_{iC_i}\,dy_i, \qquad d\bar{y}_i = \Phi_{iC_i}\,dx_i - \Phi_{iA_i}\,dz_i, \qquad d\bar{z}_i = \Phi_{iA_i}\,dy_i - \Phi_{iB_i}\,dx_i.$$


From the remark made in §0.2 that the cross differentiation test for exact 
differentials applied to the right sides of (2.19) reduces to the Euler-Lagrange 
equations, we have 







The surface

(2.20) $x = x_i(u, v), \quad y = y_i(u, v), \quad z = z_i(u, v)$

is an extremal surface of the i-th associated variation problem provided that for
$i = 1, 2, \ldots, 7$, $A$, $B$, $C$, $\Phi_C$, $\Phi_B$, $\Phi_A$, $\Phi$ are respectively different from zero everywhere
on the given extremal surface (2.4).





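The transformations $T_i$ are of the form

(2.21)
$$\begin{aligned}
A_i &= \alpha_i(A, B, C, \Phi, \Phi_A, \Phi_B, \Phi_C),\\
B_i &= \beta_i(A, B, C, \Phi, \Phi_A, \Phi_B, \Phi_C),\\
C_i &= \gamma_i(A, B, C, \Phi, \Phi_A, \Phi_B, \Phi_C),\\
\Phi_i(A_i, B_i, C_i) &= \delta_i(A, B, C, \Phi, \Phi_A, \Phi_B, \Phi_C).
\end{aligned}$$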
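By the transformation $T_kT_i$ we shall mean the transformation

(2.22) $T_kT_i:\quad A_{ki} = \alpha_i, \quad B_{ki} = \beta_i, \quad C_{ki} = \gamma_i, \quad \Phi_{ki} = \delta_i,$

where the arguments of $\alpha_i$, $\beta_i$, $\gamma_i$, $\delta_i$ are $(A_k, B_k, C_k, \Phi_k, \Phi_{kA_k}, \Phi_{kB_k}, \Phi_{kC_k})$.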
From Table III it follows that

(2.23) $T_i^2 = T_0$ for $i = 0, 1, 2, \ldots, 7$.
Therefore, an extremal surface of the i-th associated problem determines an 
extremal surface of the original problem (2.1). 


2.5. Group property of the transformations 7;. If we assume that the 
quantities A, B, C,, 4, @s , 8c, PrsPan — Pas, PasPoo — Pac , PexPoc — Pec 
are everywhere different from zero in 2, then the eight transformations 7’; are 
possible. We leave for the reader the verification of the relations 


(2.24) TT. = T : TiT, = T> ’ ToT, = T; ; 
(2.25) T3TiT3 = T; ’ i= 0, CR 2; 4; 
(2.26) T7371 = Ts ’ T3T2 = Ts ’ T3T; = T; ° 


From (2.23) and (2.24) it follows that the transformations $T_i$, $i = 0, 1, 2, 4$,
form a group of order four. From (2.25) this group is invariant under the
transformation $T_3$. From (2.26) it follows that this group and the transformation
$T_3$ generate a group of order eight the elements of which are the transformations
$T_i$, $i = 0, 1, 2, \ldots, 7$.


THE OHIO STATE UNIVERSITY.










BIBLIOGRAPHY 


1. A. Haar, Über adjungierte Variationsprobleme und adjungierte Extremalflächen, Math.
Ann., vol. 100 (1928), pp. 481-502.

2. A. Haar, Über die Variation der Doppelintegrale, Journal reine angew. Math., vol.
149 (1919), pp. 1-18.

3. T. Radó, Bemerkung über die Differentialgleichungen zweidimensionaler Variationsprobleme,
Acta Litt. Sci. Szeged, vol. 2 (1925), pp. 147-156.








FUNDAMENTAL THEOREMS OF A NEW MATHEMATICAL THEORY 
OF PLASTICITY 


By W. PRAGER 


1. Introduction. The mathematical theory of plasticity was inaugurated in 
1871 by B. de Saint Venant. The progress it has made since then is much 
smaller than that of the mathematical theory of elasticity in the period of 
almost equal length between Cauchy’s fundamental researches and the first edi- 
tion of Love’s treatise. The main reason for this comparatively slow progress 
seems to be the tremendous mathematical difficulty arising from the assump- 
tion that the material will not behave in a plastic manner unless a certain 
invariant of the stress tensor has reached a given critical value. Alongside 
plastic regions we will, therefore, have low-stressed regions in which the ma- 
terial is not yet plastic and behaves elastically. Two different sets of equations 
are valid in the plastic and the elastic regions and the problem becomes all the 
more involved by the fact that the boundary between these regions is not 
known beforehand but has to be determined so as to secure continuity of 
stresses. 

In order to avoid this great difficulty the author has proposed stress-strain 
relations which give a gradual transition from the elastic to the plastic state 
(Proc. 5th Intern. Congr. Appl. Mech., Cambridge, Mass., 1938, p. 234). In a
recent paper the simplest of these stress-strain relations has been applied to 
various problems of plane strain (Revue Fac. Sci., Univ. Istanbul, ser. A, 
vol. 5, p. 215). The present paper contains two variational principles which, 
in this new mathematical theory of plasticity, play the same rôle as Castigliano’s
principle and the principle of least work do in elasticity.


2. Stress-strain relation. As long as an elastic region subsists, strains in the 
plastic regions will be of the same order of magnitude as those in the elastic 
region. As in elasticity we assume these strains to be infinitesimal. 

Let $\sigma_{ik}$ and $\epsilon_{ik}$ be the components of the tensors of stress and strain with
respect to a set of rectangular axes. In order to simplify our equations we shall
assume the material to be incompressible. Adopting the summation convention
for repeated indices, generally used in tensor calculus, we write the condition
of incompressibility in the form

(1) $\epsilon_{pp} = 0.$

Introducing

$$\delta_{ik} = \begin{cases} 0 & \text{if } i \ne k,\\ 1 & \text{if } i = k, \end{cases}$$

and defining the mean normal stress as

(2) $\sigma = \tfrac{1}{3}\sigma_{pp},$


Received December 15, 1941. 


we write the components of the stress deviator in the form

(3) $s_{ik} = \sigma_{ik} - \sigma\,\delta_{ik}.$


The stress-strain relation used in this paper involves, besides the components
$s_{ik}$, the derivatives with respect to time $\dot{s}_{ik}$ and $\dot{\epsilon}_{ik}$. Denoting the modulus of
rigidity by $G$ and the yield stress in pure shear by $p$, we adopt the following
stress-strain relation

(4)
$$\dot{s}_{ik} = \begin{cases} 2G\,\dot{\epsilon}_{ik}, & \text{if } W = \sigma_{pq}\dot{\epsilon}_{pq} < 0,\\[4pt] 2G\left(\dot{\epsilon}_{ik} - \dfrac{W}{2p^2}\,s_{ik}\right), & \text{if } W > 0. \end{cases}$$

$W$ is the rate at which work is done per unit volume. The sign of $W$ can,
therefore, be used as a criterion whether, at the moment and point under
consideration, we have loading or unloading of the material. If $W < 0$ (unloading),
we have to apply the first of the relations (4). This is the stress-strain relation
of an incompressible elastic body written in a form which does not involve the
stresses and strains directly, but their derivatives with respect to time. If
$W > 0$ (loading), we have to use the second of the relations (4). As this is
homogeneous with respect to the derivatives $\dot{s}_{ik}$ and $\dot{\epsilon}_{ik}$, the velocity of a
deformation will have no effect on the stresses produced. All viscosity effects have,
accordingly, been left unconsidered; our treatment is concerned with plastic
effects only.

FIG. 1

In alternate tension and compression the material characterized by the stress- 
strain relation (4) will give load-deformation curves like those of Fig. 1. These 
curves are similar to those found with copper. 










The main advantage the stress-strain relations (4) have over those adopted 
in the classical theory of plasticity consists in the fact that, as long as the ma- 
terial continues to be loaded everywhere, we shall have one and the same stress- 
strain relation valid in the low-stressed regions where the material behaves 
almost elastically as well as in those regions where the material flows under 
stresses which practically have attained the yield limit. 

The second of the relations (4) can be transformed in the following manner.
Using the indices $p$, $q$ instead of $i$, $k$ and multiplying both sides by $s_{pq}$ we obtain¹

$$s_{pq}\dot{s}_{pq} = \frac{G}{p^2}\,W\,(2p^2 - s_{pq}s_{pq}).$$

Solving for $W$ and introducing into the second relation (4) the value thus
obtained, we get

(5)
$$2G\,\dot{\epsilon}_{ik} = \dot{s}_{ik} + \frac{s_{pq}\dot{s}_{pq}}{2p^2 - s_{pq}s_{pq}}\,s_{ik}.$$
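The passage from the second relation (4) to relation (5) can be checked numerically. The sketch below is only an illustration under stated assumptions: the values of `G` and `P_YIELD` (standing for the paper's $p$), the random tensors, and all helper names are hypothetical choices, not taken from the paper. It builds a stress deviator below the yield value, applies the second relation (4) for a loading strain rate, and confirms both the intermediate identity for $s_{pq}\dot{s}_{pq}$ and that (5) returns the original strain rate.

```python
import random

G = 1.3        # modulus of rigidity (arbitrary test value, an assumption)
P_YIELD = 0.8  # yield stress in pure shear, the paper's p (arbitrary test value)


def contract(a, b):
    """Full contraction a_pq b_pq of two 3x3 tensors."""
    return sum(a[i][k] * b[i][k] for i in range(3) for k in range(3))


def random_deviator(rng):
    """Random symmetric 3x3 tensor with zero trace."""
    raw = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(3)]
    sym = [[0.5 * (raw[i][k] + raw[k][i]) for k in range(3)] for i in range(3)]
    mean = (sym[0][0] + sym[1][1] + sym[2][2]) / 3.0
    return [[sym[i][k] - (mean if i == k else 0.0) for k in range(3)]
            for i in range(3)]


def scaled(t, c):
    """Multiply every component of the tensor t by the number c."""
    return [[c * t[i][k] for k in range(3)] for i in range(3)]


rng = random.Random(0)

# Stress deviator rescaled so that s_pq s_pq = 0.49 * 2 p^2 (below yield).
s = random_deviator(rng)
s = scaled(s, 0.7 * (2.0 * P_YIELD ** 2 / contract(s, s)) ** 0.5)

# Strain rate; flipped if necessary so that W > 0 (loading).
eps_dot = random_deviator(rng)
W = contract(s, eps_dot)
if W < 0:
    eps_dot = scaled(eps_dot, -1.0)
    W = -W

# Second relation (4).
s_dot = [[2 * G * (eps_dot[i][k] - W / (2 * P_YIELD ** 2) * s[i][k])
          for k in range(3)] for i in range(3)]

ss = contract(s, s)
s_sdot = contract(s, s_dot)

# Intermediate identity: s_pq sdot_pq = (G / p^2) W (2 p^2 - s_pq s_pq).
identity_gap = abs(s_sdot - G / P_YIELD ** 2 * W * (2 * P_YIELD ** 2 - ss))

# Relation (5) solved for the strain rate.
eps_dot_5 = [[(s_dot[i][k] + s_sdot / (2 * P_YIELD ** 2 - ss) * s[i][k]) / (2 * G)
              for k in range(3)] for i in range(3)]
max_gap_5 = max(abs(eps_dot_5[i][k] - eps_dot[i][k])
                for i in range(3) for k in range(3))
```

Both gaps should be at rounding level. The check uses $W = s_{pq}\dot{\epsilon}_{pq}$, which agrees with $\sigma_{pq}\dot{\epsilon}_{pq}$ because the strain rate has zero trace.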


3. The boundary problems of plasticity. Let $u_i$ be the vector of displacement
measured from a state of zero stress. We then have

(6) $\epsilon_{ik} = \tfrac{1}{2}(u_{i,k} + u_{k,i}),$

where the indices after a comma are indices of derivation; e.g.,

$$u_{i,k} = \frac{\partial u_i}{\partial x_k}.$$


Let us consider a body consisting of a material with the stress-strain relation
(4). Starting from a state of zero stress, let us deform this body by slowly
increasing forces, applied to its surface, and thus build up a system of stresses
$\sigma_{ik}$ which will satisfy the condition of equilibrium $\sigma_{ip,p} = 0$. The force per
unit area acting on a surface element with the (outward) normal $n_i$ is $f_i = \sigma_{ip}n_p$.

Assuming that the system $\sigma_{ik}$ is given, we ask for its rate of change $\dot{\sigma}_{ik}$ caused
by a given rate of change $\dot{f}_i$ of the forces acting on the surface. In dealing with
this problem we shall assume that the given values $\dot{f}_i$ are such as to lead everywhere
to a loading of the material. This assumption will be justified in most
cases of practical interest. The problem thus stated will be referred to as the
first boundary problem of plasticity.

Again assuming that the system $\sigma_{ik}$ is given, we can ask also for the strain
velocities $\dot{\epsilon}_{ik}$ caused by displacing the points of the surface with given velocities
$\dot{u}_i$. Here also we shall assume that the given velocities at the surface are such
as to lead everywhere to a loading of the material. This problem will be referred
to as the second boundary problem of plasticity.


¹ Owing to the incompressibility of the material, $W$ can be written in either of the
following forms:

$$W = \sigma_{pq}\dot{\epsilon}_{pq} = s_{pq}\dot{\epsilon}_{pq}.$$










4. The first variational principle. With respect to the first boundary problem 
of plasticity we shall prove the following 


THEOREM. Among all systems $\dot{\sigma}_{ik}$ satisfying the condition of equilibrium
$\dot{\sigma}_{ip,p} = 0$ and agreeing with the given rate of change $\dot{f}_i$ of the forces acting on the
surface, the system actually set up minimizes the integral

(7)
$$U = \int \frac{1}{4G}\left[\dot{s}_{pq}\dot{s}_{pq} + \frac{(s_{pq}\dot{s}_{pq})^2}{2p^2 - s_{rs}s_{rs}}\right] d\tau$$

extended over the total volume of the body.


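Indeed, keeping in mind that the stresses $\sigma_{ik}$ are given, we obtain

(8)
$$\delta U = \int \frac{1}{2G}\left[\dot{s}_{pq} + \frac{s_{rs}\dot{s}_{rs}}{2p^2 - s_{rs}s_{rs}}\,s_{pq}\right]\delta\dot{s}_{pq}\,d\tau = \int \dot{\epsilon}_{pq}\,\delta\dot{s}_{pq}\,d\tau,$$

where $\dot{\epsilon}_{ik}$ is the system of strain velocities which, according to (5), corresponds
to the system $\dot{\sigma}_{ik}$. Now, for the system $\dot{\sigma}_{ik}$ actually set up by the given rate of
change $\dot{f}_i$ of the forces acting on the surface, the strain velocities $\dot{\epsilon}_{ik}$ will satisfy
the condition of incompressibility $\dot{\epsilon}_{pp} = 0$ and can, moreover, be expressed by
the velocities $\dot{u}_i$ in agreement with (6). The integral (8) can, therefore, be
written in the form

(9)
$$\delta U = \tfrac{1}{2}\int (\dot{u}_{p,q} + \dot{u}_{q,p})\,\delta\dot{\sigma}_{pq}\,d\tau.$$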
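The step from the integrand of (7) to the first variation (8) amounts to saying that the derivative of the integrand with respect to $\dot{s}_{pq}$ is the strain rate given by (5). The sketch below checks this by central differences; it is an illustration only, and the numerical values of `G` and `P_YIELD` (the paper's $p$), the random tensors, and the helper names are assumptions made here.

```python
import random

G = 1.3        # modulus of rigidity (arbitrary test value, an assumption)
P_YIELD = 0.8  # yield stress in pure shear, the paper's p (arbitrary test value)


def contract(a, b):
    """Full contraction a_pq b_pq of two 3x3 tensors."""
    return sum(a[i][k] * b[i][k] for i in range(3) for k in range(3))


def random_deviator(rng):
    """Random symmetric 3x3 tensor with zero trace."""
    raw = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(3)]
    sym = [[0.5 * (raw[i][k] + raw[k][i]) for k in range(3)] for i in range(3)]
    mean = (sym[0][0] + sym[1][1] + sym[2][2]) / 3.0
    return [[sym[i][k] - (mean if i == k else 0.0) for k in range(3)]
            for i in range(3)]


def integrand_U(s, s_dot):
    """Integrand of (7) for a given deviator s and deviator rate s_dot."""
    denom = 2 * P_YIELD ** 2 - contract(s, s)
    return (contract(s_dot, s_dot) + contract(s, s_dot) ** 2 / denom) / (4 * G)


def strain_rate_from_5(s, s_dot):
    """Strain rate obtained by solving relation (5)."""
    factor = contract(s, s_dot) / (2 * P_YIELD ** 2 - contract(s, s))
    return [[(s_dot[i][k] + factor * s[i][k]) / (2 * G) for k in range(3)]
            for i in range(3)]


rng = random.Random(2)
s = random_deviator(rng)
scale = 0.6 * (2 * P_YIELD ** 2 / contract(s, s)) ** 0.5  # keep s below yield
s = [[scale * s[i][k] for k in range(3)] for i in range(3)]
s_dot = random_deviator(rng)

eps_dot = strain_rate_from_5(s, s_dot)
h = 1e-6
max_gap = 0.0
for i in range(3):
    for k in range(3):
        plus = [row[:] for row in s_dot]
        minus = [row[:] for row in s_dot]
        plus[i][k] += h
        minus[i][k] -= h
        # Central difference of the integrand in the (i, k) component.
        derivative = (integrand_U(s, plus) - integrand_U(s, minus)) / (2 * h)
        max_gap = max(max_gap, abs(derivative - eps_dot[i][k]))
```

Here the nine components of $\dot{s}_{pq}$ are perturbed independently, which is enough to compare the gradient with the right member of (8).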


Taking account of the condition of equilibrium and making the necessary
assumptions about the regularity of the surface of the body and the continuity
of the quantities $\dot{u}_i$ and $\dot{\sigma}_{ik}$ and their derivatives, we can transform the integral
(9) into the following surface integral:

(10)
$$\delta U = \int \dot{u}_p\,n_q\,\delta\dot{\sigma}_{pq}\,d\omega.$$

Now, the rate of change $\dot{f}_i = \dot{\sigma}_{ip}n_p$ of the forces acting on the surface is given.
We have therefore $n_p\,\delta\dot{\sigma}_{ip} = 0$ and the integral (10) vanishes:

(11) $\delta U = 0.$


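Furthermore,

$$\delta^2 U = \int \frac{1}{4G}\left[\delta\dot{s}_{pq}\,\delta\dot{s}_{pq} + \frac{(s_{pq}\,\delta\dot{s}_{pq})^2}{2p^2 - s_{rs}s_{rs}}\right] d\tau > 0,$$

as the invariant $s_{pq}s_{pq}$ cannot assume values greater than $2p^2$. Indeed, equation
(5) shows that with $s_{pq}s_{pq} \to 2p^2$ we shall have $s_{pq}\dot{s}_{pq} = \tfrac{1}{2}\dfrac{d}{dt}(s_{pq}s_{pq}) \to 0$.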










5. The second variational principle. The second boundary problem of
plasticity leads to the following

THEOREM. Among all systems of velocities $\dot{u}_i$ satisfying the condition of
incompressibility $\dot{u}_{p,p} = 0$ and assuming the given values at the surface, the system
actually set up minimizes the integral

(12)
$$V = \int G\left[\dot{\epsilon}_{pq}\dot{\epsilon}_{pq} - \frac{1}{2p^2}(\sigma_{pq}\dot{\epsilon}_{pq})^2\right] d\tau$$

extended over the total volume of the body.


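Indeed, keeping in mind that the stresses $\sigma_{ik}$ are given, we obtain

$$\delta V = \int 2G\left[\dot{\epsilon}_{pq} - \frac{1}{2p^2}\,\sigma_{pq}\,\sigma_{rs}\dot{\epsilon}_{rs}\right]\delta\dot{\epsilon}_{pq}\,d\tau.$$

As we have $\delta\dot{\epsilon}_{pp} = 0$, this can be written as

(13)
$$\delta V = \int 2G\left[\dot{\epsilon}_{pq} - \frac{1}{2p^2}\,s_{pq}\,s_{rs}\dot{\epsilon}_{rs}\right]\delta\dot{\epsilon}_{pq}\,d\tau = \int \dot{s}_{pq}\,\delta\dot{\epsilon}_{pq}\,d\tau = \tfrac{1}{2}\int \dot{s}_{pq}(\delta\dot{u}_{p,q} + \delta\dot{u}_{q,p})\,d\tau.$$

Here $\dot{s}_{ik}$ is the rate of change of the stress deviator which, according to the second
relation (4), corresponds to the strain velocities $\dot{\epsilon}_{ik}$.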

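Now, the system $\dot{\sigma}_{ik}$ actually set up satisfies the condition of equilibrium
$\dot{\sigma}_{ip,p} = 0$, which, with the help of (3), can be written in the form

(14) $\dot{s}_{ip,p} + \dot{\sigma}_{,i} = 0.$

The integral (13) can be transformed as follows:

$$\delta V = \int \dot{s}_{pq}\,n_q\,\delta\dot{u}_p\,d\omega - \int \dot{s}_{pq,q}\,\delta\dot{u}_p\,d\tau.$$

On the surface we have $\delta\dot{u}_i = 0$. Taking account of (14), we obtain

$$\delta V = \int \dot{\sigma}_{,p}\,\delta\dot{u}_p\,d\tau = \int \dot{\sigma}\,n_p\,\delta\dot{u}_p\,d\omega - \int \dot{\sigma}\,\delta\dot{u}_{p,p}\,d\tau = 0$$

as the variations $\delta\dot{u}_p$ satisfy the condition of incompressibility $\delta\dot{u}_{p,p} = 0$ and
vanish at the surface.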
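The second variation of $V$ is

$$\delta^2 V = \int G\left[\delta\dot{\epsilon}_{pq}\,\delta\dot{\epsilon}_{pq} - \frac{1}{2p^2}(s_{pq}\,\delta\dot{\epsilon}_{pq})^2\right] d\tau.$$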










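According to Schwarz's inequality, we have

$$s_{pq}s_{pq}\,\delta\dot{\epsilon}_{rs}\,\delta\dot{\epsilon}_{rs} \ge (s_{pq}\,\delta\dot{\epsilon}_{pq})^2$$

and consequently

$$\delta^2 V \ge \int G\left(1 - \frac{s_{pq}s_{pq}}{2p^2}\right)\delta\dot{\epsilon}_{rs}\,\delta\dot{\epsilon}_{rs}\,d\tau > 0$$

since the invariant $s_{pq}s_{pq}$ cannot assume greater values than $2p^2$.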
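The positivity argument can be sampled numerically. The following sketch is an illustration under assumptions introduced here (the values of `G` and `P_YIELD`, standing for the paper's $p$, the random sampling, and the helper names are all hypothetical). For random deviators below the yield value it compares the density of the second variation of $V$ with the lower bound obtained from Schwarz's inequality.

```python
import random

G = 1.3        # modulus of rigidity (arbitrary test value, an assumption)
P_YIELD = 0.8  # yield stress in pure shear, the paper's p (arbitrary test value)


def contract(a, b):
    """Full contraction a_pq b_pq of two 3x3 tensors."""
    return sum(a[i][k] * b[i][k] for i in range(3) for k in range(3))


def random_deviator(rng):
    """Random symmetric 3x3 tensor with zero trace."""
    raw = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(3)]
    sym = [[0.5 * (raw[i][k] + raw[k][i]) for k in range(3)] for i in range(3)]
    mean = (sym[0][0] + sym[1][1] + sym[2][2]) / 3.0
    return [[sym[i][k] - (mean if i == k else 0.0) for k in range(3)]
            for i in range(3)]


def second_variation_density(s, d_eps):
    """Integrand of the second variation of V."""
    return G * (contract(d_eps, d_eps)
                - contract(s, d_eps) ** 2 / (2 * P_YIELD ** 2))


def lower_bound_density(s, d_eps):
    """Lower bound of that integrand given by Schwarz's inequality."""
    return G * (1 - contract(s, s) / (2 * P_YIELD ** 2)) * contract(d_eps, d_eps)


rng = random.Random(1)
samples = []
for _ in range(200):
    s = random_deviator(rng)
    # Rescale so that s_pq s_pq is a random fraction of the yield value 2 p^2.
    fraction = rng.uniform(0.05, 0.95)
    c = (fraction * 2 * P_YIELD ** 2 / contract(s, s)) ** 0.5
    s = [[c * s[i][k] for k in range(3)] for i in range(3)]
    d_eps = random_deviator(rng)
    samples.append((second_variation_density(s, d_eps),
                    lower_bound_density(s, d_eps)))
```

Every sampled pair should satisfy density ≥ bound > 0, which is the pointwise form of the inequality stated for $\delta^2 V$.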


6. Summary. Adopting stress-strain relations which give a gradual transition 
from the elastic to the plastic state, the author formulates the fundamental 
boundary problems of plasticity and gives the variational principles to which 
these problems lead. 


BROWN UNIVERSITY.











THE RECIPROCAL OF CERTAIN SERIES 
By L. CARLITZ
1. Introduction. This paper is concerned with properties of the coefficients
in the reciprocal of series of the type

(1.1)
$$f(u) = \sum_{i=0}^{\infty} (-1)^i \frac{A_i}{F_i}\,u^{p^{ni}} \qquad (A_0 = 1),$$

where

$$F_i = [i]\,F_{i-1}^{p^n}, \qquad [i] = x^{p^{ni}} - x, \qquad F_0 = 1,$$

and the $A_i$ are arbitrary polynomials in the indeterminate $x$ with coefficients
in $GF(p^n)$. (While convergence questions are of little interest here we remark
that for

$$\deg A_i < c\,i\,p^{ni} \qquad (c < 1),$$

(1.1) converges for all $u$.) We denote the inverse of $f(u)$ by $\lambda(u)$ so that

(1.2) $f(\lambda(u)) = u = \lambda(f(u));$

then in general we can assert only that $\lambda(u)$ is also of the form (1.1), that is,

(1.3)
$$\lambda(u) = \sum_{i=0}^{\infty} \frac{A_i'}{F_i}\,u^{p^{ni}},$$

where the $A_i'$ are polynomials in $x$. This follows almost immediately from the
recursion formula

$$\sum_{i=0}^{m} (-1)^{i}\,\frac{F_m}{F_i\,F_{m-i}^{p^{ni}}}\,A_i\,{A'}_{m-i}^{\,p^{ni}} = 0 \qquad \text{for } m > 0,$$

and the fact that the $F$-quotients are integral (that is, polynomials in $x$). For
our purpose we shall require somewhat more, namely that $\lambda(u)$ is of the form

(1.4)
$$\lambda(u) = \sum_{i=0}^{\infty} \frac{D_i}{L_i}\,u^{p^{ni}},$$

where the $D_i$ are integral and

$$L_i = [i]\,L_{i-1}, \qquad L_0 = 1;$$

this is equivalent to requiring that the $A_i'$ in (1.3) is a multiple of $F_i/L_i$.
We now put

(1.5)
$$\frac{u}{f(u)} = \sum_{m=0}^{\infty} \beta_m \frac{u^m}{g_m} \qquad (p^n - 1 \mid m),$$

where $g_m$ is defined by

$$g_m = F_0^{b_0}F_1^{b_1} \cdots F_s^{b_s}$$
Received January 9, 1942. 


and

$$m = b_0 + b_1p^n + \cdots + b_sp^{ns} \qquad (0 \le b_i < p^n).$$

Then our main result implies the decomposition in partial fractions

(1.6)
$$\beta_m = G_m - e\,D_k^{d+1}\sum_{P}\frac{1}{P} \qquad (p^n \ne 2),$$

where $G_m$ is integral, $e$ and $d$ are rational integers ($p \nmid e$), the summation is over
irreducible polynomials $P$ of degree $k$, and $k$ is an integer determined by $m$ and
satisfying certain conditions. If these conditions are not satisfied then (1.6)
reduces to $\beta_m = G_m$, that is, $\beta_m$ is integral. For the special case $A_i = 1$ the
results of the present paper reduce to known theorems.¹ The proof depends on
properties of series of the type

(1.7)
$$\sum_{m=0}^{\infty} C_m \frac{u^m}{g_m},$$

where the $C_m$ are integral.² We call (1.7) a Hurwitz series.


2. Preliminary results. Put

(2.1)
$$f^{\,p^{nk}-1}(u) = \sum_{m} A_m^{(k)} \frac{u^m}{g_m},$$

so that $A_m^{(k)}$ is integral and indeed

(2.2) $A_m^{(k)} \equiv 0 \pmod{F_k/L_k}.$

Since by (1.2)

$$\frac{u}{f} = \frac{\lambda(f)}{f} = \sum_{k} \frac{D_k}{L_k}\,f^{\,p^{nk}-1},$$

it follows from (1.5) and (2.1) that

(2.3)
$$\beta_m = \sum_{k} \frac{D_k}{L_k}\,A_m^{(k)},$$

the summation extending over all $k$ such that $p^{nk} \le m + 1$. From (2.3) it is
clear that $L_s\beta_m$ is integral, where $s$ is the greatest integer such that $p^{ns} \le m + 1$.
However, if we make use of (2.2) we see that the $k$-th term in (2.3) is a multiple
of $F_k/L_k^2$ and therefore it follows that the denominator of $\beta_m$ contains only simple
factors.


¹ See L. Carlitz, An analogue of the Staudt-Clausen theorem, this journal, vol. 3(1937),
pp. 503-517, and vol. 7(1940), pp. 62-67. These papers will be referred to as I and II, 
respectively. 

2 See I, §3. 







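We next derive certain congruences satisfied by the $A_i$ and $D_i$. From (1.1)
and (1.4) we get the recursion formula

(2.4)
$$\sum_{i=0}^{m} (-1)^i \begin{bmatrix} m \\ i \end{bmatrix} A_i\,D_{m-i}^{\,p^{ni}} = 0 \qquad \text{for } m > 0,$$

where

$$\begin{bmatrix} m \\ i \end{bmatrix} = \frac{F_m}{F_i\,L_{m-i}^{p^{ni}}}, \qquad \begin{bmatrix} m \\ 0 \end{bmatrix} = \frac{F_m}{L_m}, \qquad \begin{bmatrix} m \\ m \end{bmatrix} = 1.$$

These coefficients occur in the polynomial

(2.5)
$$\psi_m(u) = \sum_{i=0}^{m} (-1)^{m-i} \begin{bmatrix} m \\ i \end{bmatrix} u^{p^{ni}} = \prod_{\deg A < m} (u - A),$$

the product extending over all $A$ (including 0) of degree $< m$. Now let $P$
denote an irreducible polynomial of fixed degree $k$. If in (2.5) we take $m = k$,
it is clear that $\psi_k(u) \equiv u^{p^{nk}} - u \pmod P$ so that

$$\begin{bmatrix} k \\ i \end{bmatrix} \equiv 0 \quad (0 < i < k)$$

and

$$\begin{bmatrix} k \\ 0 \end{bmatrix} \equiv (-1)^{k-1} \pmod P.$$

Now take $m = k$ in (2.4) and get

$$A_k \equiv D_k \pmod P.$$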


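More generally, it follows from the product formula for $\psi_m(u)$ in (2.5) that

$$\psi_{k+m}(u) \equiv u^{p^{n(k+m)}} - u^{p^{nm}} \pmod P,$$

which implies

$$\begin{bmatrix} k+m \\ i \end{bmatrix} \equiv \begin{cases} (-1)^{k-1} & (i = m),\\ 0 & (0 \le i < m,\ m < i < m + k). \end{cases}$$

Substituting in (2.4) we get the recursion

(2.6) $A_{k+m} \equiv A_m\,D_k^{\,p^{nm}}.$

Repeated application of (2.6) gives

(2.7) $A_{ki+m} \equiv A_m\,D_k^{\,ip^{nm}}, \qquad A_{ki} \equiv D_k^{\,i};$

this may also be written

(2.8) $A_{ki+m} \equiv A_m\,A_k^{\,ip^{nm}}, \qquad A_{ki} \equiv A_k^{\,i}.$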
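The congruence $\psi_k(u) \equiv u^{p^{nk}} - u \pmod P$ on which these recursions rest can be verified by brute force in a small case. The sketch below is only an illustration under assumptions chosen here: $p = 3$, $n = 1$, $k = 2$, and $P = x^2 + 1$ (which has no root in $GF(3)$ and is therefore irreducible); the helper names are hypothetical. It multiplies out $\prod (u - A)$ over all nine polynomials $A$ of degree $< 2$, reducing coefficients modulo $P$.

```python
from itertools import product

P_CHAR = 3  # the prime p; n = 1, so the coefficient field is GF(3)

# Residues modulo P = x^2 + 1 are stored as pairs (a0, a1) meaning a0 + a1*x.
ZERO, ONE = (0, 0), (1, 0)


def f_add(a, b):
    """Add two residues modulo P."""
    return ((a[0] + b[0]) % P_CHAR, (a[1] + b[1]) % P_CHAR)


def f_sub(a, b):
    """Subtract two residues modulo P."""
    return ((a[0] - b[0]) % P_CHAR, (a[1] - b[1]) % P_CHAR)


def f_mul(a, b):
    """Multiply two residues, reducing with x^2 = -1."""
    return ((a[0] * b[0] - a[1] * b[1]) % P_CHAR,
            (a[0] * b[1] + a[1] * b[0]) % P_CHAR)


def times_linear(poly, root):
    """Multiply poly(u) by (u - root); poly[d] is the coefficient of u^d."""
    out = [ZERO] * (len(poly) + 1)
    for d, c in enumerate(poly):
        out[d + 1] = f_add(out[d + 1], c)
        out[d] = f_sub(out[d], f_mul(c, root))
    return out


def psi_mod_P():
    """Product of (u - A) over all A of degree < 2, coefficients mod P."""
    poly = [ONE]
    for residue in product(range(P_CHAR), repeat=2):
        poly = times_linear(poly, residue)
    return poly


psi_k = psi_mod_P()

# u^9 - u, written with coefficients reduced modulo 3.
expected = [ZERO] * 10
expected[9] = ONE
expected[1] = (P_CHAR - 1, 0)
```

In this case the polynomials of degree $< k$ form a complete residue system modulo $P$, so the product runs over every element of the field with $p^{nk} = 9$ elements.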
Now put® 

k—1 
(2.9) f = flu) = Ue (-1h, 

i= 


3 See II, p. 66. 










where 


oo BG ~ Aki+; a 
ol 


i=0 


By (2.8) this becomes’ 


= gf. td n(kitj) 
(2.10) fy AV (- IEE uh? = Asi, 
i=0 bitz 
say. Then we have 
n <—" - n(kitj+1) 
gy?” =>) (-1) r 
FR; 
° , AR pn i+jt+l1) 
= [7+ 1) > (-1) : pnlk af 
kit+j+l 
where as above [i] = 2” — x. Hence 
(2.11) oP” = [j + llein jG<k-—-1), 
while for j = k — 1 we have g1 = 0. Repeated use of (2.11) gives 
go 
7 
Substituting in (2.10) we get 
ps (0 <j < hy, 
so that (2.9) becomes 
(2.12) sa d(- 1) FF fr”. 


j=0 
nk 


If now we raise both members of (2.12) to the (p" — 1)-th power we evi- 
dently get 
fj” - ~~ 1 wn R, 
where R stands for the remaining terms in the expansion of the right member; 
clearly each term in R is a multiple of fj , where s = p™, and therefore R = 0. 
This completes the proof of the following 
Lemma. If the series (1.1) has an inverse of the form (1.4) then 


(2.13) pr fS (-1 1" F Au ,, aa (mod P), 


t=—0 





where P is irreducible of degree k. 


An A, 
‘ The notation > —u" = 7 — u™ (mod P) stands for the system of congruences A, = 
Ym Jm 


A; (mod P). 










It is now easy to evaluate $A_m^{(k)} \pmod P$, defined by (2.1). In the first place,
it follows at once from (2.13) that

(2.14) $A_m^{(k)} \equiv 0$ for $p^{nk} - 1 \nmid m$.

To expand the right member of (2.13) put it in the form
(2.15) gg? ... grr. 
Then for 
$$m = \sum_i a_i p^i \qquad (0 \le a_i < p)$$

it is clear that (2.15) will contribute to $A_m^{(k)}$ only if

(2.16) $p^{nk} - 1 \mid m$

and

(2.17) $\sum_{i} a_{ink+j} = p - 1 \qquad (0 \le j < nk).$

It may be shown that (when (2.16) holds) (2.17) may be replaced by the simpler
condition⁵

(2.18) $\sum_i a_i = nk(p - 1);$


thus we see incidentally that for given m there is at most one k. If now 
(2.16) and (2.18) are both satisfied, we get, using (2.7), 


(-1)" 1+dk+nk 


2.19 A® = Di, 
( ) II a;! k 


where 
d= >> ip? Qink+j - 
t+? 
If (2.16) and (2.18) are not both satisfied, then $A_m^{(k)} \equiv 0$. Thus in all cases we
have determined $A_m^{(k)} \pmod P$ in terms of $D_k$.
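The digit conditions (2.16) to (2.18) are easy to experiment with. The sketch below is a hedged illustration, not part of the paper: the function names are hypothetical, and it only decides whether an admissible $k$ exists for a given $m$, without computing any residue. `find_k` applies (2.16) together with (2.18); `column_sums` returns the sums appearing in (2.17) so that the two forms of the condition can be compared.

```python
def base_p_digits(m, p):
    """Digits a_0, a_1, ... of m in base p, least significant first."""
    digits = []
    while m > 0:
        m, r = divmod(m, p)
        digits.append(r)
    return digits


def find_k(m, p, n):
    """Return the k >= 1 satisfying (2.16) and (2.18), or None if there is none.

    (2.18) fixes k from the digit sum, so at most one k can qualify.
    """
    if m <= 0:
        return None
    k, remainder = divmod(sum(base_p_digits(m, p)), n * (p - 1))
    if remainder != 0 or k == 0:
        return None
    return k if m % (p ** (n * k) - 1) == 0 else None


def column_sums(m, p, n, k):
    """Sums over i of a_{ink+j} for j = 0, ..., nk - 1, as in (2.17)."""
    sums = [0] * (n * k)
    for index, digit in enumerate(base_p_digits(m, p)):
        sums[index % (n * k)] += digit
    return sums
```

For example, with $p = 3$ and $n = 1$, `find_k(8, 3, 1)` gives 2, since $8 = 2 + 2\cdot 3$ has digit sum $4 = 2\cdot(3-1)$ and $3^2 - 1$ divides 8, while `find_k(3, 3, 1)` gives `None`.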


3. The main theorem. Returning to (2.3) we have

(3.1)
$$\beta_m = \sum_{k} \frac{D_k}{L_k}\,A_m^{(k)}.$$

We have already seen that the denominator of $\beta_m$ contains only simple factors.
Also, since $A_m^{(k)}$ is a multiple of $F_k/L_k$ it follows that if the term $D_kA_m^{(k)}/L_k$ be
reduced to lowest terms, then (except for the case $p^n = k = 2$) the denominator
contains irreducible factors of degree $k$ only. Now by the result at the end


5 See II, p. 63. 













of §2, if there is no $k$ satisfying both (2.16) and (2.18), then all terms in the
right member of (3.1) are integral and therefore $\beta_m$ is itself integral. If, however,
such a value of $k$ can be found, it is unique and the residue of $A_m^{(k)} \pmod P$
is determined by (2.19) for every irreducible $P$ of degree $k$. All other terms
in the right member of (3.1) are integral. We may now state the following


THEOREM 1 ($p^n \ne 2$). Assume the series

$$f(u) = \sum_{i=0}^{\infty} (-1)^i \frac{A_i}{F_i}\,u^{p^{ni}} \qquad (A_0 = 1)$$

has an inverse of the form (1.4), and put

$$m = \sum_i a_i p^i \qquad (0 \le a_i < p).$$

If the system

(3.2) $\sum_i a_i = nk(p - 1), \qquad p^{nk} - 1 \mid m$

is inconsistent, then $\beta_m$ is integral, while if (3.2) is consistent, then $k$ is uniquely
determined and

(3.3)
$$\beta_m = G_m - e\,D_k^{d+1}\sum_{P}\frac{1}{P},$$

where $G_m$ is integral, the summation is over all irreducible $P$ of degree $k$, and
oD (—1)"** ua ad 
(3.4) é= Ul ar P d= 2 tp Gink+;- 
The excluded case p" = 2 requires a more detailed examination of f°. We 
have the ne 
THEOREM 2 (p" = 2). If f(u) has an inverse of the form (1.4), then for 
9a D,A 
. a +1 1 _= Gn 2 42a 
(i) m + 1, 8 3 cg 
D;**" D:D: , DiDs 
oO: & oe, 
mes teti’ « "ser 
. 9a D, Ds 
(ii) m=2°+1(a@>0), Bm = Gu + EE OO 


in all other cases Bm is determined as in Theorem 1. 


As an immediate corollary of these theorems it follows that if $H$ is an arbitrary
polynomial, then

(3.5) $\beta_{m,H} = H(H^m - 1)\beta_m$

is integral. In particular

(3.6) $\beta_{m,x} = x(x^m - 1)\beta_m$

is integral. In the second place, if (3.2) is satisfied then it follows at once that

(3.7) $[k]\beta_m = (x^{p^{nk}} - x)\beta_m$

is integral and more generally

(3.8) $(H^{p^{nk}} - H)\beta_m$

is integral for arbitrary integral $H$; these results may be compared with (3.5)
and (3.6).

As another application, we may prove that’ for fixed k there are infinitely many 
Bm with fractional part 






(3.9) 






In general we take 







where ¢ = 1 (mod nk) but otherwise arbitrary; for p" = k = 2, we take 


m=2°'4 2 t = 1 (mod 2). 






We remark that in either case $d$ as defined by (3.4) is a multiple of $p^{nk} - 1$ so
that $D_k^{d+1}$ reduces to $D_k$.








4. Series with a multiplication theorem. The form (1.4) is suggested by a
certain class of series $f(u)$. The simplest case is given by $A_i = 1$ which implies
$D_i = 1$; in this case we have

$$f(xu) = xf(u) - f^{\,p^n}(u),$$

as is easily verified. More generally, consider series $f(u)$ satisfying

(4.1)
$$f(xu) = xf(u) + \sum_{j=1}^{s} (-1)^j \gamma_j\,f^{\,p^{nj}}(u),$$

where the $\gamma_j$ are integral. Using (1.1) we get








(4.2) av? FR, 
5 . j=l bad a 






pri 
A t—i» 









so that the $A_i$ are integral. In the second place, (1.2) and (4.1) imply

(4.3)
$$x\lambda(u) = \lambda(xu) + \sum_{j=1}^{s} (-1)^j \lambda(\gamma_j u^{p^{nj}}).$$






⁶ Compare II, p. 67.










Substituting from (1.4) we get 


(4.4) D; = > (—1)*? Lin aw". 
j=l Le 


so that the $D_i$ are integral also. Hence if $f(u)$ satisfies (4.1) its inverse is
certainly of the form (1.4); however, the converse is not always true.

When $f(u)$ satisfies (4.1) it follows that for $W = W(x)$, a polynomial in $x$ of
degree $w$, we have

(4.5)
$$f(Wu) = Wf(u) + \sum_{j=1}^{ws} (-1)^j \gamma_j(W)\,f^{\,p^{nj}}(u),$$

where $\gamma_j(W)$ is integral.
In (4.4) put i = k + m, and let P as usual denote an irreducible polynomial 


of degree k. Then we get 


Dim = (= 1) FE Die (mod P) 
i= t-+m—j 
m—j) 


If we assume D,,; = D,D; for j < m, it follows that 
Dizm = Dz Do (-1)" ee ma 
and therefore applying (4.4) again we have 
(4.6) $D_{k+m} \equiv D_kD_m \pmod P.$

Repeated application of (4.6) leads to the more general

(4.7) $D_{ki+m} \equiv D_k^{\,i}D_m \pmod P$

for $i \ge 1$, $m \ge 0$. This result may be compared with (2.7) and (2.8). Evidently
(4.7) is a necessary condition that the series (1.1) have a multiplication
theorem.

An application of a different nature may be made to the study of the coefficients
$\beta_m(W)$ appearing in the expansion

(4.8)
$$\frac{f(Wu)}{Wf(u)} = \sum_{m} \beta_m(W)\,\frac{u^m}{g_m}.$$

It follows from (4.5) that $W\beta_m(W)$ is integral; it is not difficult to extend the
results of §3 to the case of $\beta_m(W)$.


5. A special case. We consider in more detail the case $A_i = 1$. It will be
convenient to use the fuller notation

$$\psi(u, p^n) = \sum_{k=0}^{\infty} (-1)^k \frac{u^{p^{nk}}}{F_k}.$$

Then $\psi(u, p^{2n})$ satisfies (4.1) with $s = 2$, $\gamma_1 = 0$, $\gamma_2 = -1$. While Theorems
1 and 2 apply immediately relative to the larger coefficient field $GF(p^{2n})$, it is
of some interest to examine the result relative to $GF(p^n)$. Evidently (1.4)
becomes

$$\lambda(u, p^{2n}) = \sum_{k=0}^{\infty} \frac{u^{p^{2nk}}}{L_k(p^{2n})} = \sum_{k=0}^{\infty} \frac{L_{2k}(p^n)}{L_k(p^{2n})}\,\frac{u^{p^{2nk}}}{L_{2k}(p^n)},$$

so that $D_{2k+1} = 0$ and

(5.1)
$$D_{2k} = \frac{L_{2k}(p^n)}{L_k(p^{2n})} = [2k-1][2k-3] \cdots [1],$$

where

(5.2) $L_k(p^{2n}) = [2k][2k-2] \cdots [2], \qquad [k] = x^{p^{nk}} - x.$

We now determine the residue of $D_{2k} \pmod P$, where $P$ is irreducible in $GF(p^n)$
of degree $2k$. For simplicity we suppose $p \ne 2$.

Now put 

(5.3) ga RO Oe (mod P). 
a = a 


= —|[k], we have 


P= (-1' Rr, 


Since [k]”" = 


which is congruent to a number of GF(p"). However, by (5.3), 
get = fy = -1, 

so that é is not congruent to a number of GF(p"). Thus we get 

(5.4) e = », 


where $\nu$ is a non-square in $GF(p^n)$.
Next by a known theorem we have

(5.5) $P(x) = A^2(x) - \nu B^2(x),$

where $\deg A = k$, $\deg B < k$; to fix $A$ assume that the coefficient of $x^k$ is 1.
On the other hand from
P(u) = (u — z)(u — 2”) «+ (u— or”) (mod P(z)) 
we get 
(u—2z)(u—2")---(u—- 2" ”) = A(u) — EB(u), 
(u—2™)(u — 2") --- (u— or’) = A(u) + EB(u), 


thus incidentally fixing B. Now put u = z and we have at once 







(5.6) 2A(x) = 2EB(x) = (—1)*[1][3] --- [2k — 1]. 
Hence from (5.1) it is evident how $D_{2k}$ may be expressed in terms of the polynomial
$A$ defined by (5.5). As a consequence (3.3) becomes in the present case


d+1 
Y om 


Bm = Gn — e 


agrean PF’ 


where now k is even and A is determined for each P by means of (5.5); e and d 
as before are given by (3.4). More general results of this nature are left for 
another paper. 


DUKE UNIVERSITY.






















CONTENTS 


A comparison of linear measures in the plane. By Seymour Sherman
Limits of integrals. By Ralph Palmer Agnew
Classification of solutions and of pairs of solutions of y''' + 2py' + p'y = 0
by means of initial conditions. By Joseph J. Eachus
Structure and continuity of measurable flows.
By Warren Ambrose and Shizuo Kakutani
The decomposition of measures, II.
By Warren Ambrose, Paul R. Halmos, and Shizuo Kakutani
The Fuchsian equation of second order with four singularities.
By A. Erdélyi
A generalization of the Euclidean algorithm to several dimensions.
By Barkley Rosser
Positive definite functions on spheres. By I. J. Schoenberg
The analytic prolongation of a minimal surface. By E. F. Beckenbach
Additive functions and almost periodicity.
By Philip Hartman and Aurel Wintner
A correction to a previous paper. By Charles B. Morrey, Jr.
A general Kummer theory for function fields.
By Saunders Mac Lane and O. F. G. Schilling
Absolute Nörlund summability. By Leonard McFadden
Associated double integral variation problems. By Earl J. Mickle
Fundamental theorems of a new mathematical theory of plasticity. By W. Prager
The reciprocal of certain series. By L. Carlitz





